In the early to mid-1980s, Carver Mead and his associates at the California Institute of
Technology began a new movement in analog signal processing. This movement found its
basis in modeling neurobiological signal processing with “neuromorphic” silicon systems.
Neuromorphic system design consists of developing models of biological processing
mechanisms and implementing those models in VLSI circuits. After comparing digital and
biological processing systems in terms of power consumption, area, complexity, and
functionality, Mead found that neurobiological processing systems are orders of magnitude
more efficient than digital systems. This observation, coupled with the knowledge that
VLSI technologies are quickly reaching fundamental feature size limits, led to a new field
of study in which researchers investigate how to make computation more efficient in terms
of power, size, and cost.

This thesis presents a study of neuromorphic edge detection architectures and their
associated implementation issues. Specifically, the designs presented address mechanisms
for improving low-contrast edge detection performance. Several architectural realizations
are compared to evaluate which has superior performance. In addition, one algorithmic
technique is presented which improves low-contrast edge detection in silicon realizations.
Next, the architectural components are characterized to evaluate the magnitude of the
error sources associated with practical implementations, along with an analysis of
potential error minimization techniques. This thesis shows that a Differenced Gaussian
architecture exhibits superior performance to other silicon edge detection architectures.
Results from a fabricated Differenced Gaussian architecture show that a 22 millivolt edge
can be detected. Results from a Differenced Gaussian architecture incorporating a
Non-Nearest Neighbor Differencing circuit show improved performance by detecting a
9 millivolt edge signal. Measurements from several continuous-time photoreceptor and
transamp buffer topologies demonstrate a capability to reduce random offset variations to
below 2 millivolts (one standard deviation). A Differenced Gaussian architecture
incorporating floating-gate buffers and edge detectors demonstrates an edge detection
capability of 3 millivolts. Throughout the thesis, results from actual implementations are
presented to support the theory. Lastly, conclusions are drawn from this body of work and
directions are given for future areas of study.




CHAPTER  1 

AN  OVERVIEW  OF  ANALOG  VISION  SYSTEMS 


The  Foundations 

In the early to mid-1980s, Carver Mead and his associates at the California Institute of
Technology began a new movement in analog signal processing. This movement found its
basis in modeling neurobiological signal processing with “neuromorphic” [1] silicon
systems. Neuromorphic systems mimic biological processing mechanisms using analog VLSI
circuits. After comparing digital and biological processing systems in terms of power
consumption, area, complexity, and functionality, Mead found that neurobiological
processing systems are orders of magnitude more efficient than digital systems. This
observation, coupled with the knowledge that VLSI technologies are quickly reaching
fundamental feature size limits, led to a new field of study in which researchers
investigate how to make computation more efficient in terms of power, size, and cost.
Many researchers have since become intrigued by these questions and have dedicated much
time and effort to the pursuit of answers.

Most efforts to date have focused on implementing these neuromorphic systems in readily
available CMOS technologies, where complex computations are performed by exploiting the
fundamental device physics of CMOS structures. Utilizing older, “legacy” CMOS
technologies not only ensures lower cost designs but also allows for improved performance
through technology scaling. Also, since power consumption and computational flexibility
are key design issues, most MOSFET circuits are operated in the subthreshold region.
Subthreshold operation yields low power realizations due to small DC operating currents,
and it provides computational flexibility through an exponential drain current to gate
voltage relationship. Subthreshold MOSFET operation closely resembles the operating
characteristics of bipolar transistors, allowing many functions previously limited to
bipolar circuits to be realized in pure CMOS implementations [2]. Moreover, performing
computations in analog circuitry allows designers to take advantage of both the parallel
and the real-time processing aspects inherent in analog design. These advantages yield a
fertile foundation for designers looking to generate new ideas for real-time solutions of
computationally complex problems which previously required tremendous amounts of digital
computing resources.
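
As a rough illustration of the exponential drain current to gate voltage relationship
mentioned above, the sketch below evaluates the commonly used saturated subthreshold
model I_D = I_0 exp(kappa V_G / U_T). The names and the values of i0 and kappa are
illustrative assumptions and are not taken from this thesis.

```python
import math

U_T = 0.0258  # thermal voltage kT/q near room temperature, in volts


def subthreshold_current(v_gate, i0=1e-15, kappa=0.7):
    """Saturated subthreshold drain current with the source grounded.

    i0 and kappa are hypothetical, process-dependent values chosen
    only for illustration.
    """
    return i0 * math.exp(kappa * v_gate / U_T)


def volts_per_decade(kappa=0.7):
    """Gate-voltage change that produces a tenfold change in current."""
    return U_T * math.log(10.0) / kappa


if __name__ == "__main__":
    for vg in (0.3, 0.4, 0.5):
        print(f"V_G = {vg:.1f} V -> I_D = {subthreshold_current(vg):.3e} A")
    print(f"{volts_per_decade() * 1e3:.1f} mV of gate voltage per decade of current")
```

Under these assumed values, a gate-voltage change of well under 100 mV shifts the current
by a full decade, which is the bipolar-like behavior the paragraph refers to.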

The  two  most  active  areas  of  research  have  centered  on  the  auditory  [3],  [4]  and 
visual  systems.  Many  efforts  have  attempted  to  replicate  portions  or  functions  of  both 
auditory  and  visual  systems.  This  thesis  focuses  on  efforts  addressing  the  visual  system. 
Specifically,  this  thesis  addresses  new  algorithmic  techniques  and  circuit  topologies  for 
enhancing  edge  detection  in  neuromorphic  systems  for  improved  low-contrast  perfor- 
mance. The  remainder  of  this  chapter  discusses  a representative  set  of  papers  covering 
related  fields  of  early  vision  research.  Each  is  briefly  summarized  with  a discussion  of  the 
operating  concepts,  advantages,  and  disadvantages. 

The  major  contributions  of  this  thesis  are: 

• Chapter 2 presents three architectures for edge detection in low-contrast
environments. Results show that a Differenced Gaussian architecture exhibits superior
edge detection performance. Measured results from a fabricated chip demonstrate
a minimum edge detection capability of 22 millivolts (mV).

• Chapter  3 discusses  a Non-Nearest  Neighbor  Differencing  technique  for 
enhancing  edge  signals  and  improving  low-contrast  edge  detection.  Measured 
results  demonstrate  an  edge  detection  capability  in  a Differenced  Gaussian  archi- 
tecture of  9 mV. 

• Chapter 4 compares the performance of a Lateral Bipolar Photoreceptor (LBP)
to the Original Logarithmic Photoreceptor [5] in terms of dynamic range and
offset. Also, the performance of two photodetectors implemented in a 0.8 µm
double-polysilicon bipolar process is presented. It is shown that the LBP has a
dynamic range exceeding 7 orders of magnitude with random offset variations
below 2 mV to one standard deviation.

• Chapter 5 presents offset results from ten transamps implemented in a 2 µm
CMOS process provided through MOSIS [6] and one from a 0.8 µm double-polysilicon
bipolar process. It is shown that random offset variations below 2 mV to
one standard deviation can be achieved.

• Chapter 6 presents results from an edge detection chip incorporating floating-
gate buffers and comparators for improved low-contrast performance. Measured
results demonstrate a capability to detect edge signals as low as 3 mV.

• Chapter 7 summarizes the topics covered in this thesis, draws conclusions
based on the results shown, and briefly discusses performance projections for
implementations fabricated in modern processes.

Previous Early Vision Efforts

Researchers  have  previously  developed  and  demonstrated  both  one-  and  two- 
dimensional  systems  performing  a variety  of  functions  ranging  from  image  acquisition 
[4], [7-11], to motion detection [12-19], and also to more application-specific functions
[20-25]. Because each system has been tailored to a specific application, each has been
optimized  for  a certain  set  of  environmental/lighting  conditions.  When  used  in  well-con- 
trolled environments,  the  systems  perform  well.  Designing  systems  in  this  manner,  how- 
ever, results  in  application-specific  integrated  circuits  with  rigid  limitations  and  lacking 
tolerance  for  operating  environment  variations. 

Some  researchers  have  attempted  to  develop  hybrid  analog/digital  systems  with 
greater functional flexibility [26]. By capitalizing on the strengths of both analog and
digital systems, these researchers have developed a programmable array processor for various
vision  applications.  Hybrid  systems  of  this  type  may  be  the  genesis  of  future  research 
efforts  by  promoting  compromise  between  functionality,  flexibility,  cost,  and  perfor- 
mance. Areas  that  still  need  addressing  are  increasing  system  tolerance  for  environmental 
lighting  fluctuations,  increasing  dynamic  range,  and  reducing  system  offsets. 

Cellular neural networks (CNNs) are a field closely related to vision research.
CNNs place each sensor on chip in close proximity to the processing circuitry [27-29]. In
this way, all the processing is performed on raw analog information at the pixel level
before analog-to-digital conversion is performed, thus reducing communication bandwidth
and size [26]. Let us now discuss some of these efforts in order to better understand the
origins and goals of this field of research.

Edge Detection Systems

Image  acquisition  is  the  foundation  of  visual  processing.  The  first  step  in  any  sys- 
tem is  to  capture  and  convert  an  image  into  usable  signal  representations  (currents,  volt- 
ages, or  charges).  Subsequent  processing  circuitry  can  then  be  tailored  to  optimally 
process  a given  signal.  Edge  detection  is  one  fundamental  task  addressed  by  many  vision 
architectures  and  this  section  gives  a brief  description  of  several  systems  designed  to  cap- 
ture and  extract  edges  from  raw  image  data. 

Mahowald  and  Mead 

The  “Silicon  Retina”  [4]  developed  by  Mahowald  and  Mead  and  subsequently 
their  “Silicon  Retina  with  Adaptive  Photoreceptors”  [7]  mimics  the  visual  processes 
observed  in  biological  retinae.  In  the  Silicon  Retina,  processing  begins  with  a photorecep- 
tive element  developed  by  Mead  [5]  which  logarithmically  converts  incident  light  energy 
into  voltage  signals.  The  logarithmic  compression  is  useful  to  increase  the  receptor’s 
dynamic  range  allowing  it  to  operate  over  a large  range  of  ambient  lighting  conditions. 
This  is  necessary  since  ambient  lighting  in  ‘real  world’  applications  can  span  twelve 
orders  of  magnitude  [16],  severely  tasking  system  performance. 
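
To make the effect of logarithmic compression concrete, the following sketch models an
idealized receptor whose output voltage is linear in the logarithm of intensity. The
function names, the offset v0, and the slope of 60 mV per decade are assumptions chosen
for illustration, not measured values from the cited work.

```python
import math


def log_receptor(intensity, v0=1.0, volts_per_decade=0.06):
    """Idealized logarithmic receptor output in volts.

    v0 (output at unit intensity) and volts_per_decade are
    hypothetical parameters, not measurements.
    """
    return v0 + volts_per_decade * math.log10(intensity)


def edge_step(background, contrast_ratio):
    """Voltage step across an edge whose bright side is contrast_ratio
    times the background intensity."""
    return log_receptor(background * contrast_ratio) - log_receptor(background)


if __name__ == "__main__":
    span = log_receptor(1e12) - log_receptor(1.0)
    print(f"Twelve decades of intensity span {span:.2f} V")
    for background in (1e-3, 1.0, 1e6):
        step_mv = edge_step(background, 1.1) * 1e3
        print(f"background {background:g}: 10% edge -> {step_mv:.2f} mV step")
```

In this model the step produced by a given contrast ratio is the same at every background
level, but it is only a few millivolts for a low-contrast edge, so circuit offsets of
similar size become the limiting factor.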

Receptor outputs are buffered through a transconductance amplifier or transamp, a
voltage-to-current converter, onto a resistive grid used for spatio-temporal filtering [30],
[31]. Edge enhancement is performed by subtracting photoreceptor outputs from
spatio-temporally filtered local intensities, just as is done in biological retinae. Each
pixel’s output is then scanned off-chip in a row/column manner similar to conventional
charge-coupled device (CCD) systems [32], except the output signal is a current instead of
a voltage. The scanned output current is then converted to a voltage off-chip for display
on a conventional monitor. Strictly speaking, however, the Silicon Retina does not detect
edges because it does not threshold the output signals. The cell outputs are simply
scanned off-chip for display.

Mahowald  and  Mead’s  “Silicon  Retina  with  Adaptive  Photoreceptors”  [7]  worked 
much the same as the original retina except for the adaptive photoelement, which was
originally designed by Delbruck [33]. The adaptive receptor used in this realization consists
of  a parasitic  pnp  bipolar  photoreceptor  with  the  emitter  connected  to  an  adjustable  gain 
amplifier  in  a negative  feedback  loop.  Essentially,  the  feedback  circuit  acts  as  a high  pass 
filter  amplifying  high  frequency  changes  in  the  receptor  output  while  also  slowly  adapting 
to  changes  in  ambient  conditions.  The  amplifier  gain  is  controlled  externally  through  a 
biasing  transistor  while  the  feedback  is  controlled  by  selecting  a ratio  of  capacitors  before 
fabrication.  All  other  functions  are  the  same  as  in  the  original  retina.  Each  retina  func- 
tions well  when  operated  in  high-contrast  environments  but  low-contrast  performance  suf- 
fers from  offsets  caused  by  component  mismatch.  Also,  the  adaptive  retina  suffers  from 
an  asymmetrical  response  characteristic  due  to  the  circuit  topology.  This  was  corrected 
later by redesigning the adaptive element [16].
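
The behavior described above, amplifying fast changes while slowly adapting to the
ambient level, can be sketched with a simple discrete-time model. This is a behavioral
illustration only and not the circuit itself; the function name and the gain and
adapt_rate values are assumptions.

```python
def adaptive_receptor(samples, gain=10.0, adapt_rate=0.01):
    """Behavioral sketch of an adaptive receptor.

    The internal state tracks the slowly varying ambient level, and the
    output amplifies deviations from that level. gain and adapt_rate
    are hypothetical values.
    """
    state = samples[0]
    output = []
    for x in samples:
        output.append(state + gain * (x - state))  # high gain for fast changes
        state += adapt_rate * (x - state)          # slow adaptation to ambient
    return output


if __name__ == "__main__":
    # A small step in the input: 1.0 for five samples, then 1.1.
    signal = [1.0] * 5 + [1.1] * 2000
    response = adaptive_receptor(signal)
    print(f"just after the step: {response[5]:.3f}")
    print(f"after adaptation:    {response[-1]:.3f}")
```

In this model the transient response to the step is amplified by the assumed gain, while
the long-term output settles back to the new ambient level with unity gain.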

Boahen  and  Andreou 

Boahen and Andreou [8] also designed a chip modeling the vertebrate retina’s
outer-plexiform layer. Their retina implements shunting inhibition to enhance image
contrasts, and it controls inter-receptor coupling through global biasing to trade off
resolution for signal-to-noise ratio. Enhancing the signal-to-noise ratio allows the chip
to operate better in low-contrast or noisy environments. Computational circuitry is
designed using current-mode processing for compact silicon realizations and increased
functionality.

Processing begins with a parasitic pnp bipolar phototransistor performing a
light-to-current conversion. The current is drawn from a simple three transistor circuit
connected in a negative-feedback loop performing a logarithmic current-to-voltage
conversion along with localized adaptation which is externally controlled. The adaptation
rate is controlled by two mechanisms. First, global bias lines adjust the conductance of
transistors modeling the reciprocal synapses which connect adjacent receptor cells. Second,
another bias transistor adjusts the leakage current controlling the receptor’s DC operating
point, thereby modeling the horizontal cell function.

Results  from  the  fabricated  chip  demonstrate  the  circuit’s  ability  to  discern  edges 
in  noisy  environments  by  reducing  the  effects  of  offsets.  Mismatches  among  individual 
transistors  still  degraded  performance,  however,  and  the  authors  also  report  that  their 
shunting  inhibition  scheme  enlarged  the  receptive  field  as  the  intensity  increased  thereby 
decreasing  resolution.  In  the  biological  retina,  the  receptive  fields  decrease  with  increas- 
ing intensity  thereby  improving  resolution. 

Bair  and  Koch 

The system designed by Bair and Koch uses analog circuitry on-chip with the
photoreceptors to compute edge locations within a scene [11]. This same design was later
used for motion detection by passing the edge information scanned off-chip to a digital
computer [12]. The computer tracked the edge movements and thereby determined
motion. The design incorporates a one-dimensional photoreceptor array, voltage buffers
to prevent receptor loading, and two resistive networks for spatial filtering. Comparators
isolate edge locations by differencing the filtered signals, thereby approximating a
Laplacian of Gaussian computation. The magnitudes of the differences are then thresholded
to eliminate false edge detections, and the results are scanned off-chip.

The discrete resistive networks have a characteristic response resembling a
Green’s or decaying exponential function [30]. It has been shown that a Gaussian filter
has many desirable properties; however, since this response is difficult to produce in a
densely populated analog circuit, the decaying exponential response is used to approximate
a Gaussian function¹. Therefore, differencing the two filtered signals approximates a
Difference-of-Gaussian (DoG) function. Advantages to using DoG networks have been
previously demonstrated [35]. The resistive networks in this system have independently
adjustable conductances for filtering control, thereby allowing the comparators to compute
a DoG.
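
The processing chain described above can be sketched numerically: two one-dimensional
resistive networks with different smoothing widths filter the same input, their outputs
are differenced, and zero crossings with a sufficiently large difference are reported as
edges. This is a minimal software model under stated assumptions and not the chip's
circuitry; the parameter lam (ratio of lateral to vertical conductance), the default
values, and the function names are hypothetical.

```python
def resistive_network(v_in, lam):
    """Steady-state node voltages of a 1-D resistive network.

    Solves (1 + lam * degree_i) * u_i - lam * (sum of neighbours) = v_i
    with the Thomas algorithm for tridiagonal systems. lam is the assumed
    ratio of lateral to vertical conductance; needs at least two nodes.
    """
    n = len(v_in)
    diag = [1.0 + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -lam
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag[0]
    d[0] = v_in[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (v_in[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def detect_edges(v_in, lam_narrow=1.0, lam_wide=4.0, threshold=0.01):
    """Return indices i where the difference of the two smoothed signals
    changes sign between nodes i and i + 1 by more than threshold."""
    narrow = resistive_network(v_in, lam_narrow)
    wide = resistive_network(v_in, lam_wide)
    diff = [a - b for a, b in zip(narrow, wide)]
    edges = []
    for i in range(len(diff) - 1):
        crosses_zero = diff[i] * diff[i + 1] < 0.0
        if crosses_zero and abs(diff[i + 1] - diff[i]) > threshold:
            edges.append(i)
    return edges


if __name__ == "__main__":
    step = [0.0] * 16 + [1.0] * 16
    print(detect_edges(step))
```

The response of each network to a single-node input falls off geometrically with
distance, which is the decaying exponential behavior noted above. Raising the threshold
in this model plays the role of rejecting small differences caused by offsets.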

Chip  results  are  favorable  for  high  contrast  signals.  Problems  arise,  however, 
when  operated  in  low-contrast  environments.  High  offset  levels  reduce  dynamic  range 
and  signal-to-noise  ratio  resulting  in  spurious  edge  detections  in  low-contrast  environ- 
ments. The  authors  attribute  the  offset  problems  to  the  receptor’s  logarithmic  response 
since  it  compresses  edge  signals  in  addition  to  background  signals.  Therefore  the  edge 
signals  are  reduced  to  the  computational  circuit  offset  levels.  Thus  minimum  discernible 
signal  levels  in  low-contrast  environments  are  directly  related  to  the  offset  levels  in  the 
analog  circuits. 


1. A chip performing Gaussian smoothing has been implemented previously but the
design consumed too much area for practical implementations [34].

Motion Detection Systems

Motion  detection/analysis  systems  have  received  a great  deal  of  attention  over  the 
past  decade  due  to  the  variety  of  possible  applications  in  both  academia  and  industry. 
Two-dimensional  motion,  however,  is  a difficult  quantity  to  characterize  within  an  image 
due to the aperture problem. Essentially, the aperture problem can be understood by
considering a large object passing by a small viewing window. As the object is passing, an
observer cannot be sure of seeing the entire object moving past the viewing window. Thus
it is difficult to distinguish the true direction of travel or completely identify the
object’s features [36]. Mathematically, only the component perpendicular to the edge contour can be
computed  using  local  information.  This  section  discusses  a number  of  motion  detection 
systems  highlighting  strengths  and  weaknesses  of  each  system. 
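
The statement that only the velocity component perpendicular to the edge contour is
locally computable can be illustrated with the standard brightness-constancy constraint,
ix*u + iy*v + it = 0. That formulation is an assumption introduced here for illustration
and is not stated in the text; the function name is hypothetical.

```python
def normal_flow(ix, iy, it):
    """Velocity component along the intensity gradient (normal flow).

    ix, iy are spatial intensity derivatives and it is the temporal
    derivative at one pixel. Only the projection of the true velocity
    onto the gradient direction can be recovered from these values.
    """
    mag2 = ix * ix + iy * iy
    if mag2 == 0.0:
        return (0.0, 0.0)  # no gradient, so no local motion information
    scale = -it / mag2
    return (scale * ix, scale * iy)


if __name__ == "__main__":
    # A vertical edge (gradient along x) moving with true velocity (2, 3).
    true_u, true_v = 2.0, 3.0
    ix, iy = 1.0, 0.0
    it = -(ix * true_u + iy * true_v)
    print(normal_flow(ix, iy, it))
```

For the vertical edge in the example, the recovered vector contains the horizontal
component of the motion but none of the motion along the edge itself.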

Tanner  and  Mead 

This  “Optical  Motion  Sensor”  [13]  is  composed  of  a two-dimensional  photo-array 
with  a photo-conversion  and  motion  detection  circuit  in  each  cell.  To  determine  motion, 
the  authors  subject  the  image  to  a set  of  constraints  to  determine  a global  velocity  esti- 
mate. Outputs  from  each  element  are  compared  with  a global  response  signal  to  produce  a 
correction  factor.  Essentially,  localized  image  information  is  combined  to  produce  a glo- 
bal estimate  of  image  velocity  instead  of  producing  local  estimates  at  each  pixel.  One 
constraint  imposed  is  that  all  features  within  an  image  are  assumed  to  belong  to  a single 
moving  object.  Therefore,  movement  of  features  in  the  image  plane  causes  each  element 
to  produce  a global  correction  factor  which  is  combined  with  the  other  cell  outputs  to  form 
the global estimate.

Each cell consists of a photoreceptor and computational circuit. The computational
circuitry consists of four analog multipliers, a summing circuit, and several other
support components. Altogether it is a very complicated design which is prone to offsets.
Although not stated, offsets in the multipliers and current mirrors likely dominate
performance at the lower end of the dynamic range.
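
One way to express the idea of combining local measurements into a single global velocity
is a least-squares fit of the brightness-constancy constraint over all pixels. The sketch
below is an assumed software formulation for illustration only; it does not describe the
analog feedback used on the chip, and the function name is hypothetical.

```python
def global_velocity(gradients):
    """Least-squares velocity (u, v) for a single rigidly moving image.

    gradients is an iterable of (ix, iy, it) tuples, one per pixel.
    Minimizes the sum of (ix*u + iy*v + it)**2 over all pixels.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients share one orientation: the aperture problem.
        raise ValueError("velocity is ambiguous for a single edge orientation")
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


if __name__ == "__main__":
    # Three pixels with different gradient directions, true velocity (2, 3).
    pixels = [(1.0, 0.0, -2.0), (0.0, 1.0, -3.0), (1.0, 1.0, -5.0)]
    print(global_velocity(pixels))
```

Pooling pixels with different gradient orientations is what resolves the ambiguity that
any single pixel suffers from, at the cost of assuming one moving object.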

Moore  and  Koch 

In  this  design,  the  authors  detect  motion  using  a multiplication/integration  scheme 
for  tracking  edge  movements  with  time  [14].  Processing  begins  with  the  photoreceptors 
driving  both  a set  of  multipliers  and  an  analog  time  delay  circuit.  The  time  delay  outputs 
are  sent  to  the  multipliers  as  secondary  inputs  allowing  the  multiplier  to  distinguish  tem- 
poral signal  variations.  The  output  is  a combination  of  the  two  multiplier  currents  which 
produce  temporal  and  spatial  derivatives  thereby  yielding  a velocity  estimate  which  is  fed 
off-chip. 

Performance  results  illustrating  responses  to  various  background  illuminations  are 
presented.  The  AC  and  DC  characteristics  varied  under  the  conditions  tested.  The  AC 
response  indicates  the  presence  of  edges  and  the  DC  response  is  a function  of  the  back- 
ground lighting.  As  the  ambient  lighting  increases,  the  DC  operating  point  shifts  lower  in 
voltage  while  the  magnitude  of  the  AC  response  increases  implying  a higher  gain  at  higher 
ambient intensities. These characteristics can be traced to the photoreceptors, which
again are the logarithmic receptors developed by Mead [5]. These receptors are discussed
further in Chapter 4. The AC response variations introduce errors into the velocity
measurements by corrupting the integration function since edges in low lighting
environments do not appear as large as those in high lighting environments. Possible
solutions are to redesign the processing circuits or substitute photoreceptors with a
fixed DC operating point which does not change with ambient lighting conditions. Another
possibility is to use adaptive receptors [16].

Horiuchi, Bair, Bishofberger, Moore, Koch, Lazzaro

The article, “Computing Motion Using Analog VLSI Vision Chips: An Experimental
Comparison Among Different Approaches” [15], discusses two chips which compute
motion from on-chip imaging arrays. For details on the second chip, see the “Moore and
Koch” section above. The first chip uses a temporal correlation scheme comparing adjacent
photoreceptor outputs from a 1-D receptor array to estimate image motion. The receptors
first capture image information and transform it into voltages. The receptors are both
time-adaptive, to increase the dynamic range, and also sensitive to small changes in
intensity, to improve edge detection in low-contrast environments [33]. The receptors indicate
an  edge  by  producing  an  output  pulse  when  a feature  possessing  sufficient  contrast  moves 
across  the  receptor  array.  The  edge  pulses  then  propagate  down  delay  lines  possessing 
programmable  time  delays.  Correlators  connected  between  adjacent  delay  line  elements 
then  compare  outputs  looking  for  edge  signals  so  that  features  moving  within  the  proper 
velocity  range  trigger  edge  outputs.  When  a correlator  detects  an  edge  on  both  delay  lines 
simultaneously  it  charges  an  output  line.  Since  the  time  delay  of  each  stage  is  programma- 
ble and  known  a priori,  velocities  are  determined  based  on  the  location  of  the  activated 
correlator.  Correlator  outputs  are  fed  to  a winner-take-all  circuit  which  subsequently  com- 
putes the  strongest  or  ‘winning’  edge  signal  corresponding  to  the  velocity  estimate. 
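
The correlation scheme can be sketched in discrete time. In the simplified model below,
the edge pulse from one receptor is delayed by a whole number of stages and compared with
the pulse from its neighbor; the stage with the greatest coincidence wins and its known
delay gives the velocity. The function name, the sample-based timing, and the use of a
simple maximum as the winner-take-all step are assumptions made for illustration.

```python
def estimate_velocity(pulse_a, pulse_b, spacing, stage_delay, n_stages):
    """Estimate velocity from edge pulses at two adjacent receptors.

    pulse_a, pulse_b: lists of 0/1 samples (receptor A is crossed first).
    spacing: receptor separation; stage_delay: delay per stage in samples.
    Returns spacing per sample, or None if no correlator is activated.
    """
    scores = []
    for k in range(1, n_stages + 1):
        shift = k * stage_delay
        # Coincidence of A delayed by k stages with the undelayed B pulse.
        score = sum(a and b for a, b in zip(pulse_a, pulse_b[shift:]))
        scores.append(score)
    # Winner-take-all: pick the most strongly activated correlator.
    best = max(range(n_stages), key=lambda i: scores[i])
    if scores[best] == 0:
        return None
    return spacing / ((best + 1) * stage_delay)


if __name__ == "__main__":
    a = [0] * 16
    b = [0] * 16
    a[3] = a[4] = 1    # edge crosses receptor A at samples 3 and 4
    b[9] = b[10] = 1   # and reaches receptor B six samples later
    print(estimate_velocity(a, b, spacing=6.0, stage_delay=2, n_stages=5))
    print(estimate_velocity(b, a, spacing=6.0, stage_delay=2, n_stages=5))
```

In this model, motion in the reverse direction, or at a speed whose transit time exceeds
the total delay available, produces no coincidence and therefore no estimate.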

Performance characteristics show favorable results for high contrast edge signals.
Low-contrast signals, however, are a problem due to noise amplification in the
photodetection circuitry. Also, even though the detectable velocity range is programmable,
the range is bounded at any one time by the delay settings and the number of elements in
the delay chain. Therefore velocities outside the programmed range cannot be characterized.
Another factor contributing to velocity errors is mismatch in the delay components,
resulting in correlation location errors. Finally, due to the correlator design, motion
detection is unidirectional, which can be either a benefit or a limitation based on the
application.

Delbruck 

In his dissertation [16], Delbruck presents a two-dimensional correlation-based
motion detector. Motion is detected using spatial correlators similar to those described in
the “Horiuchi, Bair, Bishofberger, Moore, Koch, Lazzaro” section above. As in the previous
correlator, motion detection is again unidirectional. Using a biologically motivated
algorithm, Delbruck implements a ‘null-inhibition’ scheme to mask signals propagating in
the opposite direction to desired motion. The photoreceptors are also time-adaptive [33] to
improve low-contrast performance. Adaptation increases the receptor’s dynamic range by
responding quickly to high frequency signals while slowly adjusting the DC operating
point to reflect ambient lighting changes, thereby maintaining near optimum response.

Detailed testing involved two test patterns focused on the array: first, a set of
parallel lines moving in the same direction and at the same velocity, and second, a
rotating spiral pattern. Captured output images illustrate chip performance. Responses
from the parallel line test show the chip responding well to unidirectional motion. In
addition, responses to the spiral test pattern show favorable results for both
unidirectional motion detection and also null-inhibition.

The design has several shortcomings. First, detectable velocities are subject to the
receptor adaptation time, again limiting the dynamic range. For example, if the observed
motion is slower than the adaptation rate then motion could be missed altogether or could
produce erroneous motion signals. Second, motion in low-contrast scenes may be missed due
to the receptor effectively nulling out slow moving edge signals. Details were not given
as to the illumination or contrast levels used for these tests.

Application  Specific  Early  Vision  Systems 

Many vision systems do not fall into the categories of image acquisition or motion
detection and are therefore grouped here. These neuromorphic vision systems are
designed as application specific integrated circuits (ASICs) whose tasks vary from
determining object position to automatic alignment. For brevity this section only discusses
four previously reported systems.

Wyatt, Keast, Seidel, Standley, Horn, Knight, Sodini, Lee, and Poggio

The paper “Analog VLSI Systems for Image Acquisition and Fast Early Vision
Processing” [9] describes three systems developed at the Massachusetts Institute of
Technology for visual processing applications. It is a survey analyzing several different
techniques for image processing, namely continuous-time analog signal processing,
discrete-time processing using charge-coupled devices (CCDs), and discrete-time processing
in a switched-capacitor network. Each system begins with similar image acquisition
circuitry but each processes the subsequent information differently. The first
architecture is an object orientation and position chip designed by D. L. Standley [20]
and is discussed later. The second chip, designed by Keast and Sodini, implements a focal
plane processor for image acquisition, smoothing, and segmentation. CCDs capture the
image, which is subsequently processed through a hybrid combination of digital and analog
CMOS circuitry. The smoothing and segmentation computations are performed simultaneously,
with smoothing accomplished through charge sharing and segmentation accomplished by
thresholding.

The  third  chip,  designed  by  Seidel,  Knight,  and  Wyatt,  performs  a depth/slope 
computation for edge detection using switched-capacitor circuits. This work is an
extension of Harris’ research investigating this topic [37]. The fundamental concept is to
minimize an energy function by penalizing deviations from a given smoothness function. The
minimizing  functions  are  implemented  through  interconnected,  redundant  switched- 
capacitor  networks.  Results  from  simulations  show  great  promise  for  extracting  image 
details. 

These  two  chips  demonstrate  the  flexibility  inherent  in  neuromorphic  processing. 
Images  are  captured  by  a number  of  different  means  and  are  represented  by  different  sig- 
nal types,  i.e.  voltages,  currents,  and  charges.  The  signals  are  then  processed  in  real-time 
using  analog,  digital,  and  hybrid  analog/digital  circuitry.  Although  not  specifically  stated 
in  the  papers,  offsets  likely  set  the  fundamental  lower  limits  on  processing  accuracy  and 
dynamic  range  for  all  these  systems.  In  the  focal  plane  processor,  offsets  in  the  sense 
amplifier  used  for  computing  edge  locations  will  determine  the  minimum  edge  thresholds 
since  settings  below  these  levels  will  result  in  false  edge  detections.  Similarly  in  the 
depth/slope  network,  capacitor  mismatches  and  errors  from  clock  feedthrough  in  the 
switching  circuitry  will  determine  the  lower  end  of  the  dynamic  range.  Also,  offsets  in  the 
receptive elements will degrade performance.

Standley

David  Standley  designed  a chip  for  computing  object  position  and  orientation  from 


images  focused  on  chip  [20],  Position  and  orientation  are  determined  from  the  objects 
first  and  second  moments  which  are  used  to  compute  the  centroid  and  axis  of  least  inertia 
thereby  describing  position  and  orientation.  The  moments  are  processed  by  gathering 
information  from  both  a resistive  grid  and  a set  of  linear  and  nonlinear  resistive  lines.  The 
chip  produces  eight  output  currents  containing  the  orientation  and  position  information 
needed  to  solve  the  double  integrals  in  the  moment  computations. 
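
The moment computation described above can be sketched in software. The following is a minimal Python sketch, assuming a thresholded image stored as a list of rows of non-negative values; the function name `position_and_orientation` and the image layout are illustrative assumptions, and this is the textbook off-chip formulation rather than Standley's resistive-grid circuit.

```python
import math

def position_and_orientation(image):
    """Return ((cx, cy), theta) from the zeroth, first, and second image moments.

    `image` is a list of rows (hypothetical layout); theta is the angle of the
    axis of least inertia, in radians, measured from the x axis.
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v          # zeroth moment (total signal)
            m10 += x * v      # first moment in x
            m01 += y * v      # first moment in y
    if m00 == 0:
        raise ValueError("image contains no signal above threshold")
    cx, cy = m10 / m00, m01 / m00   # centroid gives the position

    mu20 = mu02 = mu11 = 0.0        # central second moments
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - cx, y - cy
            mu20 += dx * dx * v
            mu02 += dy * dy * v
            mu11 += dx * dy * v
    # Standard result: the axis of least inertia satisfies
    # tan(2*theta) = 2*mu11 / (mu20 - mu02).
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), theta
```

For a nearly circular object `mu11` and `mu20 - mu02` are both close to zero, so small offsets dominate the angle, which is consistent with the requirement that images be sufficiently elongated.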

The  chip  contains  a 29  x 29  photoreceptor  array  connected  to  a 30  x 30  linear  resis- 
tor array.  Raw  image  signals  are  first  thresholded  to  remove  any  background  lighting  or 
noise variations. Additionally, since the computations are performed off-chip, the receptor offsets have been measured and stored for calibrating the raw chip data before computing the moments, reducing offset-induced errors. Each photocell produces a current which is
injected  into  the  resistive  array.  The  array  currents  are  then  applied  to  either  a uniform  or 
a quadratic  resistive  line  which  weights  the  currents  according  to  their  location.  Each  cur- 
rent is  then  buffered  off-chip  where  it  is  measured  and  the  moments  are  subsequently  pro- 
cessed on  a computer.  Compressing  the  information  on-chip  into  only  eight  currents 
allows  the  chip  to  be  effectively  sampled  at  5000  frames/sec. 

Chip  results  show  that  position  and  orientation  can  be  determined  within  several 
degrees  for  high-contrast  images  of  sufficient  size  and  elongated  dimension.  Circuit  off- 
sets again  limit  performance  to  high  contrast  images,  and  in  this  case,  the  images  must  be 
sufficiently elongated to determine orientation accurately. The orientation results are significantly improved by sampling the chip’s offsets prior to use and compensating off-chip.

DeWeerth

In  a topic  similar  to  Standley’s,  Stephen  DeWeerth  designed  several  chips  that  per- 
form stimulus  localization  and  centroid  computation  [21].  In  his  first  chip,  DeWeerth 
implemented  a one-dimensional  processing  array  for  stimulus/centroid  computation  using 
current  integration.  Photodiodes  acquire  an  image  and  supply  an  illumination  relative  bias 
current  to  a one-dimensional  array  of  differential  pairs.  The  differential  pair  inputs  consist 
of  a position  encoded  reference  signal  and  a global  feedback  voltage  which  has  encoded 
the  stimulus  position  along  the  array.  The  differential  pair  output  currents  are  differenced 
using  a current  mirror  and  the  resultant  signal  is  fed  back  as  the  reference  signal.  In  order 
to  minimize  offsets  and  improve  resolution,  the  differential  pairs  are  realized  using  verti- 
cal npn  bipolar  transistors  instead  of  MOSFETs  in  a CMOS  process. 

In  the  second  chip,  DeWeerth  implemented  a two-dimensional  version  of  the  one- 
dimensional chip.  Test  results  were  reported  for  both  chips  using  a bright  stimulus  on  a 
dark  background.  Results  showed  the  chips  performing  the  centroid  computation  and 
localization  operations.  Several  system  drawbacks  were  evident  from  the  tests.  First,  the 
circuits  required  high  contrast  images  for  reliable  operation  implying  that  offsets  are  again 
a problem  and  second,  they  are  unable  to  distinguish  individual  stimuli  when  multiple 
stimuli  are  present. 

Umminger  and  Sodini 

Umminger and Sodini addressed the topic of automatic sensor alignment [22]. Citing examples from industry which note a desire for fast, precise alignment tools, they designed and fabricated a ‘smart’ analog sensor for automatic alignment. Alignment begins when a surveyor’s mark is focused on the aligning sensor. The sensor breaks the alignment process into two steps; the first is a rough alignment phase where the sensor is
sectioned  into  four  equal  sized  quadrants.  Photodiode  currents  relating  illumination  levels 
in  each  quadrant  are  processed  to  equalize  the  four  currents.  Processing  in  this  stage  is  a 
brute  force  computation  on  the  diode  currents  along  with  normalization  to  eliminate  back- 
ground illumination  dependencies. 

As the coarse alignment process proceeds and the surveyor’s mark comes into
alignment,  processing  enters  the  second  or  fine  alignment  phase.  In  this  phase,  the  origi- 
nal diodes  are  broken  into  four  sets  of  three  photodiodes.  To  minimize  errors  resulting 
from  photodiode  mismatch,  the  fine  alignment  process  uses  edge  detection  versus  a cur- 
rent magnitude  comparison.  Currents  from  the  twelve  diodes  are  analyzed  and  corrections 
are  produced  to  minimize  total  current  flow  thereby  aligning  the  mark.  The  fine  alignment 
circuitry  is  a bit  more  sophisticated  than  the  coarse  alignment  processing  circuitry  with 
three  stages  of  current  amplification  and  normalization.  Normalization  is  again  necessary 
to  remove  background  illumination  dependencies.  Chip  results  demonstrate  a minimum 
repeatability  of  53  parts  per  million  while  consuming  only  4 milliwatts. 

Brajovic  and  Kanade 

The  last  application  specific  neuromorphic  processor  that  is  discussed  was 
designed by Brajovic and Kanade and is “A Sorting Computational Sensor” [23]. The
system  captures  images  and  processes  data  based  upon  a temporal  correlation  paradigm 
consisting  of  two  processing  levels.  The  first  comprises  local  processors  encoding  image 
data  at  the  pixel  level.  The  second  contains  a global  processor  which  can  be  tailored  to  a 
variety  of  tasks.  The  processing  paradigm  is  based  on  the  mammalian  visual  system, 
encoding or prioritizing visual information based on intensity, which yields decreased processing latency and communication bandwidth. Processing latency is decreased since
computations  are  performed  continuously  on  the  most  recent  data  and  bandwidth  is 
decreased  because  individual  cells  only  report  information  to  the  global  processor  when  a 
‘significant  event’  occurs.  In  this  case,  the  ‘significant  event’  is  classified  as  the  pixel 
intensity  exceeding  a predetermined  threshold. 

The  local  processor  operates  by  charging  a capacitor,  in  this  case  an  inverter’s  par- 
asitic input  capacitance.  Simultaneously,  the  capacitor  is  discharged  by  a photodiode  at  a 
rate  relative  to  the  incident  illumination.  When  the  voltage  level  drops  below  the 
inverter’s threshold, the inverter changes state, latching the photocell’s array coordinates. The coordinates are then transmitted to the global processor, which catalogs the arrival in time. The temporal correlation relates each cell’s intensity level to the rate of discharge from the photodiode. Therefore photocells containing significant information, ones encountering higher illumination levels, are processed before those with little or no significant information. This ordering enables the global processor to take on higher level processing tasks such as histogram computation, equalization, point-to-point mapping, or segmentation, whatever the application requires.
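
The intensity-to-time encoding can be illustrated with a small model. This is a minimal sketch, assuming a constant photocurrent that discharges the capacitor linearly; the function name, parameter names, and unit values are hypothetical and are not taken from the chip description.

```python
def sort_pixels_by_arrival(photocurrents, capacitance=1.0,
                           v_start=1.0, v_threshold=0.5):
    """Return (trip_time, pixel_index) events in order of arrival.

    Assumed model: each pixel's capacitor starts at v_start and discharges
    at a rate photocurrent / capacitance, so the inverter trips after
    t = C * (v_start - v_threshold) / I.
    """
    events = []
    for index, current in enumerate(photocurrents):
        if current <= 0:
            continue  # a dark pixel never reaches the inverter threshold
        trip_time = capacitance * (v_start - v_threshold) / current
        events.append((trip_time, index))
    events.sort()  # the global processor sees the brightest pixels first
    return events
```

Because the trip time scales as the reciprocal of the photocurrent, the arrival order is the intensity ranking, which is what a global processor would need for tasks such as histogram computation.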

There are two problems with this design. First, the response time is inversely proportional to ambient illumination, resulting in slower response times in low lighting environments, which is true of most continuous-time receptor systems. Second, the photoarray is a time-sampled system. Instead of continuously processing all photocell outputs, each cell is sampled when its receptor exceeds the predetermined threshold and is then idle until the global reset signal is issued and the next image is processed. However, the fundamental design goals of bandwidth reduction and data prioritization are demonstrated in the chip results.

Literature  Summary 

While  the  examples  cited  are  by  no  means  exhaustive,  the  reader  can  begin  to 
grasp  the  variety  of  potential  applications  amenable  to  neuromorphic  processing.  Each 
system  addresses  a set  of  design  trade-offs  to  perform  an  intended  function  within  certain 
limitations.  Applications  range  from  industrial  solutions  to  biological  modeling.  One 
limitation  affecting  all  applications,  though,  is  offsets.  Offsets  limit  dynamic  range, 
reduce  resolution,  increase  error  rates,  and  require  larger  silicon  realizations.  To  compen- 
sate for  the  offsets,  designers  have  either  ignored  the  limitations  and  operated  their  sys- 
tems under  conditions  which  render  the  offsets  negligible  or  they  increased  their  layout 
size  to  try  to  reduce  the  offsets.  The  first  method,  of  course,  is  no  solution  but  designers 
chose  to  operate  their  devices  under  these  conditions  to  prove  their  fundamental  design 
philosophies.  The  second  method  is  a conventional  yet  brute  force  solution  which  has 
been  used  in  analog  design  for  years. 

Objectives 

This  dissertation  focuses  on  issues  relating  to  the  design  of  low-contrast,  edge 
detection  systems.  Robust  edge  detection  in  low-contrast  environments  is  an  essential 
task  of  most  early  vision  systems.  Chapter  2 compares  performance  characteristics  of 
three  edge  detection  architectures  for  low-contrast  operation  detailing  the  trade-offs  of 
each. Chapter 3 presents a Non-Nearest Neighbor Differencing technique to improve signal retention in low-contrast environments. Chapter 4 discusses several continuous-time, logarithmic photoreceptors, contrasting the signal-to-offset performance of each. Chapter 5 presents offset characteristics from ten transamp realizations, quantifying offset improvements and methodologies. Chapter 6 presents results from an edge detection architecture incorporating floating-gate devices for improved low-contrast performance. Finally, Chapter 7 presents performance projections to modern processes and future directions for this research.


CHAPTER  2 

A COMPARISON OF THREE EDGE DETECTION ARCHITECTURES FOR LOW-CONTRAST VISION SYSTEMS

Design  Goals 

As can be seen from the reviews in Chapter 1, low-contrast performance in early vision systems is fundamentally limited by offsets. Therefore this thesis investigates circuit designs and algorithmic techniques for reducing offsets and improving low-contrast performance. Additionally, designs will emphasize solutions yielding high resolution and low power consumption. To achieve these goals, subthreshold analog circuitry is used throughout to reduce power consumption and size while increasing computational capability [1], [36], [38].

A fundamental  task  in  vision  processing  is  feature  detection.  Features  consist  of 
edges,  objects,  colors,  textures,  surfaces,  etc.  This  work  concentrates  on  edge  detection 
since  many  applications  are  based  on  processing  edge  information  as  exhibited  by  the  sys- 
tems discussed  in  Chapter  1.  As  examples,  motion  is  computed  by  tracking  edge  move- 
ments in  “Bair  and  Koch”  on  page  7,  “Moore  and  Koch”  on  page  10,  and  “Horiuchi,  Bair, 
Bishofberger, Moore, Koch, Lazzaro” on page 11. Therefore paramount among processing tasks is robust edge detection. This chapter discusses three processing architectures for edge detection. The first is based on a Laplacian of Gaussian algorithm, the second is based on a Difference of Gaussians algorithm, and the third is based on a Differenced Gaussian algorithm. Portions of this chapter have previously been published [39].

Edge  Detection  Architectures 

The  resolution  and  performance  of  an  analog  VLSI  computational  circuit  can  be 
characterized by its signal-to-noise ratio [40]. Therefore it is crucial to address the components of this metric in every stage of development to optimize performance. Designs
must  minimize  losses  from  filtering  or  inefficient  architectural  realizations  to  efficiently 
utilize  available  signals.  Feature  information  becomes  more  difficult  to  retain  as  practical 
implementation  issues  such  as  offsets,  noise,  and  finite  gain  are  incorporated  since  these 
signals  increase  noise  figures  and  thereby  increase  filtering  requirements.  In  analog  com- 
putational architectures,  offsets  are  a more  fundamental  limiting  factor  than  traditional 
analog  noise  sources  such  as  white  or  flicker  noise  due  to  the  respective  noise  magnitude 
contributions.  Offset  levels  between  adjacent  circuits  are  shown  to  be  several  millivolts  to 
one  standard  deviation  while  the  more  traditional  noise  sources  contribute  much  lower 
signal levels due to the use of large transistors. In addition, the bandwidth of computational circuits is typically several kilohertz to several tens of kilohertz, which reduces noise
levels  induced  in  the  computational  circuitry.  In  this  chapter  offsets,  noise,  and  finite  gain 
are  lumped  into  a single  parameter  and  referred  to  simply  as  noise.  Therefore,  edge  detec- 
tion algorithms  incorporated  in  silicon  realizations  must  retain  the  available  signal 
strength while minimizing noise contributions.

In order to better understand the three architectures presented in this chapter, let us first define several relations common to each. First, the function $D_i$ is the discrete magnitude difference between adjacent photo-cell outputs as shown in equation (2-1).

$$D_i = S_i - S_{i+1} \tag{2-1}$$

Also in these three architectures, the signal representations up to the signal labeled $S_i$ in Figure 2-2, Figure 2-5, and Figure 2-7 are unique and will be defined in the appropriate sections. Processing after $S_i$ is functionally identical and the equation representing this thresholding step is

$$O_i = \begin{cases} 1, & |D_i| > V_{th} \\ 0, & \text{otherwise} \end{cases} \tag{2-2}$$

where $O_i$ is the discrete, binary output signal indicating an edge location, $V_{th}$ is the threshold voltage, and $D_i$ is the discrete spatial derivative of $S_i$ defined in equation (2-1).

The Laplacian of Gaussian (LoG) and Difference of Gaussian (DoG) architectures use a combination of zero crossings, determined by a second derivative computation, and thresholding to determine edge locations. The Differenced Gaussian (DG) architecture computes a first derivative and thresholds the result for edge detection. The critical computation in all three designs is thresholding, which separates edges from noise. In these designs, Gaussian functions are used to approximate the symmetric, decaying exponential responses of the HRes [30] filtering networks implemented in the silicon realizations for ease of mathematical representation.¹ A plot of the normalized Gaussian and decaying exponential responses is shown in Figure 2-1. The characteristic length used to compute both responses was $\sigma = L = 5$, where $\sigma$ is the standard deviation of the Gaussian function, defined in this context as the function’s characteristic length, and $L$ is the characteristic length of the HRes function, or the distance at which the exponential response has decreased by a factor of $e$. Filtering decreases the offsets presented to the edge detection circuits, which reduces the threshold magnitude requirements and allows the edge detection circuits to detect smaller edge signals.

¹ Simulations using the actual exponential filter response characteristics have shown similar results to using Gaussians.

Figure 2-1. Plot of the normalized Gaussian and decaying exponential smoothing functions. Each plot shows how an input at the center of a 1-D array is shared or spread among neighboring pixels. In this example, the characteristic length $\sigma = L = 5$.
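
The differencing and thresholding steps common to all three architectures can be sketched as follows. This is a minimal Python illustration, assuming the threshold is applied to the magnitude of each nearest-neighbor difference; the function names and the sample values are hypothetical.

```python
def nearest_neighbor_differences(s):
    """D_i = S_i - S_(i+1) for adjacent cell outputs (one fewer value than s)."""
    return [s[i] - s[i + 1] for i in range(len(s) - 1)]

def detect_edges(s, v_th):
    """Binary edge map O_i: 1 where |D_i| exceeds the threshold v_th (assumed form)."""
    return [1 if abs(d) > v_th else 0 for d in nearest_neighbor_differences(s)]

# A 20 mV edge with millivolt-level offsets on each cell output (made-up values).
outputs = [0.000, 0.003, 0.001, 0.021, 0.020]
print(detect_edges(outputs, v_th=0.010))    # threshold above the offset level
print(detect_edges(outputs, v_th=0.0015))   # threshold below the offset level
```

With the threshold set above the offset level only the true edge is flagged; lowering it below the offsets produces false edges, which is why filtering that reduces offsets also permits a smaller threshold.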

The remainder of this chapter is organized as follows: first, a Laplacian of Gaussian architecture is presented which computes thresholded zero-crossings for edge detection. Next, a Difference of Gaussian architecture is presented [12] which also computes a thresholded zero-crossing for edge detection. Lastly, a discrete Differenced Gaussian architecture is presented which computes a thresholded peak detection for edge isolation [39]. Following these discussions, a comparison is presented detailing signal retention characteristics of each architecture. In addition, response characteristics from a chip implementing the DG architecture are presented. The architectural discussions assume ideal, noiseless circuit components, neglecting practical implementation issues such as finite gain and offset.

Figure 2-2. LoG Architecture. This architecture computes the discrete second derivative of a one-dimensional Gaussian filtering network. The computation is performed by the two levels of transamps acting as spatial differentiators.

Laplacian  of  Gaussian  Algorithm 

The first architecture implements a Laplacian of Gaussian function, or more specifically, it computes the second spatial derivative of the input. Figure 2-2 shows a one-dimensional schematic representation where photoreceptor outputs are buffered by transamps (voltage-to-current converters) to prevent receptor loading onto a one-dimensional HRes resistive network [30]. The characteristic length of the resistive network is controlled by a bias voltage while the first and second discrete spatial derivatives are computed by two layers of transamps acting as differentiators or spatial comparators. The first derivative produces a peak response at an edge location while the second derivative produces a zero-crossing where an edge occurs in the input. Figure 2-3 shows a representation of the resultant signal at each processing stage. A 1 V step input signal is shown in Figure 2-3a representing an ideal photoreceptor input signal. Figure 2-3b shows a filtered version corresponding to the HRes outputs. The filtered signal is produced by convolving the step input with a Gaussian function having $\sigma = 5$. Differentiating the filtered signal once results in the peak response shown in Figure 2-3c, and computing the second derivative results in the zero-crossing response shown in Figure 2-3d. Each derivative function contains the desired edge information, but the crucial point is how the information is processed. This architecture isolates the zero-crossing in Figure 2-3d to determine edge locations. In digital image processing this task is easily accomplished due to the offset-free computations. In silicon systems, however, these computations are performed using components with finite gain and offsets, and as can be seen in Figure 2-3d, there is less than 20 mV signal difference to isolate the zero crossing.

Figure 2-3. LoG architecture response characteristics at each processing stage. A clean input signal is depicted in ‘a’ and a filtered version using $\sigma = 5$ is shown in ‘b’. The peak response computed from the first derivative of ‘b’ is shown in ‘c’ and ‘d’ shows the zero-crossing response resulting from the second derivative computation. Note the 60 mV decrease in available signal strengths from the first derivative results to the second derivative results, approximately 80 mV zero-to-peak versus -10 mV to 10 mV peak-to-peak.

Figure 2-4. LoG response to an input signal consisting of a narrow feature composed of two closely located edges forming a pulse. Note the rapid signal loss as filtering increases from $\sigma = 1, 2, \ldots, 16$.
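
The processing stages described for Figure 2-3 can be reproduced numerically. This is a minimal sketch, assuming the filtered step takes the closed form of a step convolved with a Gaussian and that pixel centers sit at half-integer positions on either side of the edge; the function names and the grid are illustrative assumptions, not the thesis's simulation.

```python
import math

def gaussian_smoothed_step(x, amplitude, sigma):
    """Step of height `amplitude` at x = 0 after convolution with a Gaussian."""
    return 0.5 * amplitude * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def difference(values):
    """Nearest-neighbor difference D_i = S_i - S_(i+1)."""
    return [values[i] - values[i + 1] for i in range(len(values) - 1)]

def log_stages(amplitude=1.0, sigma=5.0, half_width=40):
    """Return the filtered signal and its first and second differences."""
    xs = [i + 0.5 for i in range(-half_width, half_width)]  # assumed pixel centers
    filtered = [gaussian_smoothed_step(x, amplitude, sigma) for x in xs]
    first = difference(filtered)    # peaks at the edge
    second = difference(first)      # changes sign (zero-crossing) at the edge
    return filtered, first, second

filtered, first, second = log_stages()
print("first-difference peak  (V):", max(abs(d) for d in first))
print("second-difference peak (V):", max(abs(d) for d in second))
```

Under these assumptions the first-difference peak for a 1 V step with $\sigma = 5$ comes out close to 80 mV and the second-difference extremes close to 10 mV in magnitude, in line with the values quoted in the Figure 2-3 caption.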

It  can  be  shown  that  the  system  step  response  is  described  by 


where $\sigma$ is the characteristic length of the filtering network, $A$ is the input signal magnitude, and $x$ is the spatial position. Locally differenced responses, $S_i$, of the LoG architecture to a 20 mV signal consisting of two closely located edges forming a pulse input are shown in Figure 2-4. The family of curves shown represents the response change corresponding to characteristic lengths of $\sigma = 1, 2, \ldots, 16$.

There are three points of interest within Figure 2-4. First, the signal response decreases very rapidly as $\sigma$ increases, implying that noisy systems requiring a large amount of filtering will have trouble isolating low-contrast edges. Second, the maximum attainable signal difference for an ideal (noiseless) system is $A$, which occurs at $\sigma = 0$. Third, this algorithm can extract single pixel width edge locations by combining thresholded local differences with zero-crossings, a difficult task for the succeeding two architectures.

One drawback of this architecture is its complexity. Multiple layers of processing result in higher accumulated offsets and consequently lower S/N ratios. Also, the additional circuitry increases cell size, thereby reducing resolution, and increases power consumption.

Difference  of  Gaussian  Algorithm 

The Difference of Gaussian [12] architecture computes thresholded zero-crossings by smoothing the input signal on two Gaussian filtering networks, each having a different characteristic length. Then a pixel-by-pixel differencing operation computes the magnitude difference between the two Gaussian networks. It has been shown [35] that a DoG function using $\sigma_{F2} = 1.6\sigma_{F1}$ closely approximates computing a LoG or second derivative of the input signal. A DoG architecture, however, has the advantage of greater computational stability in an analog network since second derivatives computed by open-loop transamps are very susceptible to offsets. Figure 2-5 shows five cells from a one-dimensional Difference-of-Gaussian array. Receptor outputs are buffered through transamps onto separate HRes networks to prevent loading. The characteristic length [30] of each network is independently controlled by off-chip bias voltages. Edge locations are computed by thresholding nearest-neighbor differences as shown in equation (2-2).

Figure 2-5. DoG Architecture. Five adjacent photo-processing cells from the zero-crossing chip produced by Bair [12]. Photoreceptor outputs are buffered, filtered, and a DoG operation is performed. Edge signals are determined by thresholding locally differenced outputs. (Stages labeled in the schematic: Logarithmic Photoreceptors, Transamp Buffers, HRes Resistive Networks, Wide Range Output Buffers.)

The step response of an ideal realization of the system in Figure 2-5 is shown in equation (2-4)

$$S_i = \frac{A}{2}\left[\operatorname{erf}\!\left(\frac{x}{\sqrt{2}\,\sigma_{F1}}\right) - \operatorname{erf}\!\left(\frac{x}{\sqrt{2}\,\sigma_{F2}}\right)\right] \tag{2-4}$$

where $A$ denotes the input signal magnitude of the step function, $\sigma_{F1}$ and $\sigma_{F2}$ denote the characteristic lengths of the respective filter networks, and $x$ represents the spatial position.

Figure 2-6 shows the locally differenced values from equation (2-4) when an edge input pulse of $A = 20$ mV is applied to a system with characteristic lengths $\sigma_{F2} = 1.6\sigma_{F1}$ and $\sigma_{F1} = 1, 2, \ldots, 16$. The arrow indicates the response change as $\sigma_{F1}$ is increased. In Figure 2-6, the maximum attainable signal difference is approximately 3 mV and occurs when $\sigma_{F1} = 1$, that is, when minimum filtering is applied. The theoretical maximum signal difference for an ideal system of infinite length is $A$. This occurs when there is no filtering in one resistive network, resulting in a sharp transition at the step input, and an infinite amount of filtering in the second network, resulting in a dc average of half the input signal magnitude, $A/2$. As in the LoG architecture, the architectural complexity results in large offsets and small S/N ratios.

Figure 2-6. A plot of the discrete nearest neighbor differences function, $D_i$, resulting from the DoG architecture when $\sigma_{F2} = 1.6\sigma_{F1}$ and $\sigma_{F1} = 1, 2, \ldots, 16$. The peak nearest neighbor signal difference is approximately 3 mV.
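
As a rough check on the 3 mV figure, the slope of the DoG step response at the edge gives a first-order estimate of the largest nearest-neighbor difference. The estimate below assumes the error-function form $S(x) = \tfrac{A}{2}\left[\operatorname{erf}\!\left(x/\sqrt{2}\,\sigma_{F1}\right) - \operatorname{erf}\!\left(x/\sqrt{2}\,\sigma_{F2}\right)\right]$ and approximates the adjacent-pixel difference by the continuous derivative, so it is an approximation rather than the thesis's own computation:

$$\left.\frac{dS}{dx}\right|_{x=0} = \frac{A}{\sqrt{2\pi}}\left(\frac{1}{\sigma_{F1}} - \frac{1}{\sigma_{F2}}\right) = \frac{A}{\sqrt{2\pi}\,\sigma_{F1}}\left(1 - \frac{1}{1.6}\right) \approx 0.15\,\frac{A}{\sigma_{F1}}$$

For $A = 20$ mV and $\sigma_{F1} = 1$ this gives about 3.0 mV per pixel, in line with the approximately 3 mV peak reported for Figure 2-6, and it falls off as $1/\sigma_{F1}$ as filtering increases.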

An interesting difference between the DoG response in Figure 2-6 and the LoG response in Figure 2-4 is that the LoG response is more spatially compact than the DoG response for the filtering constants chosen. The reason is that an optimal fit between the Difference-of-Gaussian response and the Laplacian-of-Gaussian function has not been performed since the goal of this discussion is to determine which architecture retains the most signal under realistic filtering requirements.

Figure 2-7. Differenced Gaussian Architecture. Five processing cells from a chip implementing a locally Differenced Gaussian algorithm for edge localization.


Differenced  Gaussian  Algorithm 

The Differenced Gaussian (DG) architecture computes spatial differences from a single filtered version of the input to isolate edge locations as depicted in Figure 2-7. Again several processing cells are depicted; each is composed of a logarithmic photoreceptor, a transamp buffer, an HRes filtering network, and transamp comparators for computing edge locations. It can be shown [41] that the expression representing an ideal response to a step input is


where $A$ is the step input signal magnitude, $\sigma$ is the filter’s characteristic length, and $x$ is the spatial position. Figure 2-8 shows the differenced results when a 20 mV step is introduced into the system while the filter’s characteristic length is increased from 1 to 16. The peak difference occurs at $\sigma = 1$ and is approximately 8 mV. In this algorithm, the maximum difference between adjacent points for an ideal infinite length system is $A$, which occurs when $\sigma = 0$, yielding the step transition between adjacent pixels.

Figure 2-8. Differenced Gaussian Results. Nearest neighbor differencing results, $D_i$, showing the peak signal magnitude as $\sigma$ is increased from 1 to 16.
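
A first-order estimate helps explain the size of the peak. Assuming the filtered step takes the Gaussian-smoothed form below (an assumption based on the Gaussian approximation used in this chapter, not a transcription of the thesis's expression), the adjacent-pixel difference is approximated by the slope at the edge:

$$S(x) = \frac{A}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}\,\sigma}\right)\right], \qquad \left|\frac{dS}{dx}\right|_{x=0} = \frac{A}{\sqrt{2\pi}\,\sigma} \approx 0.40\,\frac{A}{\sigma}$$

For $A = 20$ mV and $\sigma = 1$ this gives about 8.0 mV, consistent with the approximately 8 mV peak reported for Figure 2-8. The $1/\sigma$ dependence also indicates that the peak shrinks in proportion to the characteristic length as filtering is increased.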

Several  notes  on  the  DG  algorithm:  first,  it  can  produce  thick  edges  since  a simple 
magnitude  comparison  is  being  performed  which  can  reduce  resolution  and  require  addi- 
tional post-processing  for  edge  thinning.  Second,  it  is  the  least  complex  of  the  three  archi- 
tectures which  reduces  offsets  and  improves  the  S/N  ratio.  Lastly,  results  from  a chip 
implementing  this  architecture  are  presented  later  in  this  chapter. 

Comparison 

The maximum recoverable signal for each algorithm in an ideal situation is $A$. The signal-to-noise ratio, however, depends on the amount of filtering applied and the noise or offset level. Figure 2-9 shows a comparison of peak signal outputs from all three algorithms corresponding to a step input of 20 mV. The filtering coefficients used are $\sigma_{F2} = 1.6\sigma_{F1} = 1.6\sigma_{DG} = 1.6\sigma_{LoG}$, where $\sigma_{DG}$ and $\sigma_{LoG}$ represent the filtering constants used in the Differenced Gaussian and Laplacian of Gaussian algorithms respectively, while $\sigma_{F1}$ and $\sigma_{F2}$ are the DoG filtering constants. The peak signal obtained from the DoG algorithm occurs at $\sigma_{F1} = 0.7$ and has the value 2.25 mV, while the Differenced Gaussian algorithm yields a signal difference of 10.4 mV at this same filtering level.

Figure 2-9. Architectural response comparison. Peak signal differences between adjacent points for all algorithms. For the DoG algorithm $\sigma_{F2} = 1.6\sigma_{F1}$ is assumed, therefore the filtering coefficient plotted along the x-axis refers to $\sigma_{F1}$.

As  filtering  increases,  all  functions  tend  toward  zero  but  for  characteristic  lengths 
between  1 and  10  the  Differenced  Gaussian  algorithm  provides  superior  signal  retention. 
The  LoG  algorithm  yields  similar  signal  retention  characteristics  to  the  DG  algorithm  at 
low filtering constants but quickly loses this capability as filtering increases. Because the DG algorithm thresholds differences from a first derivative computation, it retains greater signal magnitudes than the other algorithms. The others localize zero-crossings having slopes, computed through second derivatives, greater than some threshold. One consideration is that extremely low or high filtering constants are not practical for VLSI implementations since filtering constants below 1 essentially perform no filtering while constants above 15 spread edge signals over an extremely large region, making detection difficult.
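
The signal retention trend can be explored with a short numerical sketch. It assumes Gaussian-smoothed step responses, pixel centers at half-integer positions on either side of the edge, and a LoG signal modeled as the second nearest-neighbor difference of the filtered input; these modeling choices and all names are assumptions, so the absolute values depend on the sampling grid and need not match Figure 2-9.

```python
import math

ROOT2 = math.sqrt(2.0)

def smoothed_step(x, amplitude, sigma):
    """Step of height `amplitude` at x = 0 after Gaussian smoothing."""
    return 0.5 * amplitude * (1.0 + math.erf(x / (ROOT2 * sigma)))

def diff(values):
    """Nearest-neighbor difference D_i = S_i - S_(i+1)."""
    return [values[i] - values[i + 1] for i in range(len(values) - 1)]

def peak_signals(amplitude, sigma, ratio=1.6, half_width=100):
    """Return peak |difference| for the (DG, DoG, LoG) models at one sigma."""
    xs = [i + 0.5 for i in range(-half_width, half_width)]  # assumed pixel centers
    narrow = [smoothed_step(x, amplitude, sigma) for x in xs]
    wide = [smoothed_step(x, amplitude, ratio * sigma) for x in xs]
    dg_peak = max(abs(d) for d in diff(narrow))
    dog_peak = max(abs(d) for d in diff([n - w for n, w in zip(narrow, wide)]))
    log_peak = max(abs(d) for d in diff(diff(narrow)))
    return dg_peak, dog_peak, log_peak

if __name__ == "__main__":
    for sigma in (1, 2, 4, 8, 16):
        dg, dog, log_peak = peak_signals(0.020, sigma)
        print(f"sigma={sigma:2d}  DG={dg * 1e3:.2f} mV  "
              f"DoG={dog * 1e3:.2f} mV  LoG={log_peak * 1e3:.2f} mV")
```

In this model the DG peak falls roughly as $1/\sigma$ while the second-difference signal falls roughly as $1/\sigma^2$, which matches the qualitative ordering described above for characteristic lengths between 1 and 10.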

From these results, it is clear that the Differenced Gaussian algorithm makes better use of the available signal than the other methods. Therefore, one can conclude that in analog VLSI networks it is intrinsically easier to localize magnitude changes than slope changes. In addition, the Differenced Gaussian's simpler computational structure results in a more compact, lower-power realization with reduced offsets and a better S/N ratio. One technique to improve low-contrast performance in all three architectures is Non-Nearest Neighbor Differencing [41]. This technique increases the spatial sampling distance used in the differencing computations to retain greater signal levels. This topic is further discussed in Chapter 3.


Differenced  Gaussian  Chip  Results 

The Differenced Gaussian architecture has been implemented in a 2 µm analog N-well CMOS process provided through MOSIS [6]. Figure 2-10 shows a captured oscilloscope plot of the one-dimensional network showing random offsets originating in the transamp buffer circuits. In this plot, a uniform input signal of 3 V has been applied to all buffer inputs simultaneously. Trace 1 shows the frame sync signals indicating the beginning of the array and Trace 2 shows the individual receptor cell outputs. There are twenty-seven receptor cells in the array; several cell outputs are indicated in the figure. The offsets associated with this transamp topology are approximately 1.7 mV (one standard deviation). Transamp offsets are further discussed in Chapter 5. Trace 2 employs AC coupling and has a 5 mV/division magnitude resolution to better examine the output offset variations.

Figure 2-10. Oscilloscope plot of a one-dimensional Differenced Gaussian network. Trace 1 shows the frame sync pulses indicating the array beginning while Trace 2 shows the array outputs when a uniform 3 V input signal is applied to all buffers simultaneously. Details for each trace are shown at the bottom of the figure. No filtering was employed in this example.

Figure 2-11 shows a captured oscilloscope plot of the one-dimensional Differenced Gaussian network showing the frame syncs in Trace 1, the filtered receptor outputs in Trace 2 using L = 5, and the differencing circuitry outputs in Trace 3. A simulated edge signal has been introduced in the center of the array, as shown in Trace 2, to determine the minimum detectable signal at the chosen bias conditions. The minimum detectable edge signal was 22 mV. From Trace 2, one can begin to see the problem posed by circuit offsets. Offset differences between adjacent cells can corrupt the input signal by increasing or decreasing the edge signal. In Trace 3, at the end of the array, the last differencing circuit has no signal on one input line since it has no neighbor; outputs from this cell are therefore invalid and are ignored.

Figure 2-11. Oscilloscope plot of one-dimensional Differenced Gaussian network showing frame sync (Trace 1), receptor outputs (Trace 2), and differencing circuitry outputs (Trace 3). A simulated edge signal of 22 mV has been introduced between two cells in the center of the array as indicated in Trace 2 and a filtering constant of L = 5 has been applied. Trace 3 shows the detected edge signal from the differencing circuits.


Conclusions 

This chapter compared results from three architectures used for computing edge locations in one-dimensional analog VLSI networks. It has shown that a Differenced Gaussian algorithm is superior to a Difference of Gaussian or Laplacian of Gaussian implementation in silicon networks due to its superior signal retention characteristics. Signal retention is essential in order to overcome noise sources in analog computational circuits. The Differenced Gaussian algorithm is also the least complex, which reduces size, offsets, and power consumption. Lastly, results from a chip implementing a one-dimensional Differenced Gaussian architecture were presented.


CHAPTER  3 

NON-NEAREST NEIGHBOR DIFFERENCING

Introduction

From the architectural discussions in Chapter 2 it is clear that noise and offsets limit minimum detectable signal levels. Offsets in the receptors and edge detection circuits dominate these limitations, especially when operating in subthreshold [25], [38]. Steps can be taken to minimize processing circuitry offsets [2] by modifying layout configuration and size, but offsets in the photoreceptors are strictly limited by fabrication imperfections and size. Previous efforts by this author using simple transamp layouts and small receptor sizes have resulted in combined offsets from the receptors and buffers on the order of 10 mV (one standard deviation). These results were the best obtained from several chip designs meant to explore offset minimization techniques [2]. They are discussed in Chapter 4 and Chapter 5 along with improved transamp and receptor designs. While these layout techniques did reduce offsets in the processing circuitry, photoreceptor offsets were unaffected. Other techniques to reduce offsets, such as switched-capacitor filters, may be used, but this research centers on solutions for low-power, continuous-time systems and therefore does not address discrete time-sampled systems.

Filtering recovers some lost dynamic range by reducing offsets in the processing stages preceding the filtering components, but at the cost of also smoothing the input signal. Filtering does nothing, however, to compensate for offsets in subsequent computational circuitry; thus a technique for recovering the desired signal from the filtered response is needed to improve performance. This chapter presents a Non-Nearest Neighbor Local Differencing (NNND) technique for recovering lost signal strength at the expense of reducing resolution. Portions of this chapter have previously been published [41].

One-Dimensional  Image  Filtering  and  Edge  Detection 

An ideal edge in a one-dimensional system can be modeled as a step function. Processing begins by applying the step to a 1-D array of HRes circuits for smoothing. The HRes response characteristics were shown in Figure 2-1, where the spatial filtering is equivalent to convolving an input with the network transfer function. Following a gradient descent approach, spatial differencing is performed for edge localization. This computation can be described by a discrete Difference of Gaussian (DoG) function as shown in equation (3-6), where the results depend only upon the spatial position, x, and the filtering constant, σ.

$$F(x, \sigma) = \int_{-\infty}^{\infty} \Phi(x')\,\frac{A}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - x')^{2}}{2\sigma^{2}}}\,dx' \;-\; \int_{-\infty}^{\infty} \Phi(x')\,\frac{A}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - x' - 1)^{2}}{2\sigma^{2}}}\,dx' \qquad \text{(3-6)}$$

Here Φ(x) represents the input step function and A is the input signal magnitude. This function can be simplified to the results shown in equation (3-7) and has the solution indicated in equation (3-8).
Increasing the filtering results in smaller localized differences; thus offsets and noise accumulated in preceding processing stages are attenuated in proportion to the filtering applied. However, since filtering also smooths the edge signals, large filtering constants require systems to distinguish smaller edge signals. This limits edge detection capabilities to the offset levels found in the edge detection circuits.

Since the edge detection circuitry offsets can be tens of millivolts [17], the performance restrictions can be severe. Signal differences between nodes must be large enough to overcome these offsets and also the circuit's finite gain to robustly identify edge locations. One way to improve edge detection is to increase the spatial sampling distance used in the differencing computation. Since inputs are spread among neighboring pixels by filtering, increasing the spatial distance results in spanning larger percentages of the desired signal, thus increasing the signal magnitudes used by the differencing circuitry. This is represented mathematically by replacing the absolute nearest neighbor difference in equation (3-8) by the constant, D, as shown in equation (3-9).


Figure 3-1a shows the results of computing local differences using equation (3-9). In this example σ = 5, D increases from 1 to 5, and the input is A = 1 V. By increasing D, the peak signal magnitude presented to the differencing circuitry quadruples.
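The effect of the spatial separation constant on the peak difference can be sketched numerically. The following is a minimal illustration, assuming a unit-area Gaussian smoothing kernel (so the smoothed step has a closed form in terms of the error function) and differences evaluated at integer node positions; the function names are hypothetical and are not taken from the thesis.

```python
import math


def smoothed_step(x, sigma, amplitude=1.0):
    """Step of the given amplitude at x = 0 after smoothing by a unit-area Gaussian."""
    return 0.5 * amplitude * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def local_difference(x, sigma, d, amplitude=1.0):
    """Difference between the smoothed signal at node x and at the node d positions away."""
    return smoothed_step(x, sigma, amplitude) - smoothed_step(x - d, sigma, amplitude)


def peak_difference(sigma, d, amplitude=1.0, half_width=50):
    """Largest local difference found over a grid of integer node positions."""
    return max(local_difference(x, sigma, d, amplitude)
               for x in range(-half_width, half_width + 1))


if __name__ == "__main__":
    # Same settings as the example in the text: sigma = 5, A = 1 V, D = 1..5.
    for d in range(1, 6):
        print(f"D={d}: peak difference = {peak_difference(5.0, d):.4f} V")
```

Under these assumptions the peak grows monotonically with D, in line with the several-fold increase described for Figure 3-1a; the exact ratio depends on the kernel normalization assumed here.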

Figure 3-1b shows a noisy input signal and its corresponding filtered response. The input signal is a 40 mV step function superimposed on a zero-mean random noise signal with a standard deviation of 10 mV. The noise signal represents the photoreceptor and buffer offsets, and the deviation corresponds to offset measurements previously observed by the author on fabricated designs. A large filtering constant is required to reduce the offsets and increase the signal-to-noise ratio, so a value of σ = 10 has been chosen for the characteristic length.

Results from the spatial differencing technique are shown in Figure 3-2a, where local differences taken from the filtered signal in Figure 3-1b are shown using spatial separations of D = 1, 3, 5, 7, 9, and 11. Signal strength is improved by a factor of 10 from D = 1 to D = 11 (the peak signal increases from 1.5 mV to over 16 mV). The increased signal is needed to overcome offsets in the differencing circuitry. For example, Figure 3-2b shows the same differencing results when a zero-mean random noise signal having a standard deviation of 2 mV is added to the computation to account for offsets in the differencing circuitry. It can be seen that the signal recovered using D = 1 cannot be distinguished from the noise as robustly as the signal recovered using D = 11. The differenced signals are subsequently thresholded to determine edge locations. This task becomes easier and the results more robust as D increases. The results from a fabricated one-dimensional system are presented later in this chapter.

Figure 3-1(a). Computation Magnitude vs Spatial Distance. Signal magnitudes as a function of the Spatial Separation Constant, D. The family of curves shows how increasing the spatial separation constant increases the available signal for processing. For this example σ = 5.

Figure 3-1(b). Noisy and Filtered Input Signal. Noisy input signal and corresponding filtered signal. The input signal is a 40 mV step function with a superimposed random noise signal having a standard deviation of 10 mV. The filtering constant is σ = 10.

Figure 3-2(a). Ideal Computed Differences. Local differences computed from the filtered signal of Figure 3-1b using idealized offset-free differentiators. The curves correspond to D = 1, 3, 5, 7, 9, and 11.

Figure 3-2(b). Computed Differences Using Noisy Circuitry. Responses generated using differentiators having random offsets with a standard deviation of 2 mV. Using D = 1 does not distinguish the edge signal from the noise as well as D = 5 and D = 11.
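The one-dimensional experiment described above can be approximated with a short simulation. This is only a sketch under stated assumptions: the smoothing is a truncated, normalized discrete Gaussian (not a model of the HRes network), the array length and random seeds are arbitrary, and all function names are hypothetical.

```python
import math
import random


def gaussian_kernel(sigma):
    """Discrete Gaussian weights truncated at 4 sigma and normalized to unit sum."""
    radius = int(4 * sigma)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Gaussian smoothing with the end values replicated beyond the array."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    return [sum(w * signal[min(max(i + k - radius, 0), n - 1)]
                for k, w in enumerate(kernel))
            for i in range(n)]


def nnnd(signal, d, detector_offset_sd=0.0, rng=None):
    """Differences between nodes d apart, each with an optional random detector offset."""
    rng = rng or random.Random(0)
    return [signal[i + d] - signal[i] + rng.gauss(0.0, detector_offset_sd)
            for i in range(len(signal) - d)]


def simulate(step_mv=40.0, input_sd_mv=10.0, sigma=10.0, n=200, seed=1):
    """Noisy step (receptor and buffer offsets modeled as noise), then smoothing."""
    rng = random.Random(seed)
    raw = [(step_mv if i >= n // 2 else 0.0) + rng.gauss(0.0, input_sd_mv)
           for i in range(n)]
    return smooth(raw, sigma)


if __name__ == "__main__":
    filtered = simulate()
    detector_rng = random.Random(2)
    for d in (1, 3, 5, 7, 9, 11):
        ideal = max(nnnd(filtered, d))
        noisy = max(nnnd(filtered, d, detector_offset_sd=2.0, rng=detector_rng))
        print(f"D={d:2d}: ideal peak {ideal:5.2f} mV, with 2 mV detector offsets {noisy:5.2f} mV")
```

With the input noise switched off, this model gives peak differences close to the 1.5 mV and 16 mV figures quoted for D = 1 and D = 11, which suggests the unit-area Gaussian assumption is a reasonable stand-in for the filtering used in the text.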


Two-Dimensional  Implementations  Incorporating  the  Spatial  Separation  Constant 

Applying the NNND technique to a two-dimensional system yields many of the benefits observed in one-dimensional systems. However, in our simulations the benefits gained were dependent upon the differencing scheme. The differencing method used for comparison computes horizontal and vertical differences separately to isolate edges in the x or y direction. Figure 3-3 shows how the input signal was generated. Figure 3-3a shows an input signal consisting of a 25 x 25 pixel tower of 40 mV magnitude in a noiseless environment. Figure 3-3b shows the input superimposed with a zero-mean random noise signal having a standard deviation of 10 mV. Finally, Figure 3-3c shows the result of filtering the noise-corrupted signal with a two-dimensional Gaussian smoothing function having σ = 3.

Figure 3-3. (a) Input Signal: a two-dimensional input signal consisting of a 25 x 25 pixel tower with 40 mV magnitude. (b) Noise Corrupted Input Signal: the input signal superimposed on a zero-mean random noise signal with a standard deviation of 10 mV. (c) Gaussian Filtered Signal: the resulting function after Gaussian filtering with a smoothing function having σ = 3.

Computing differences on Figure 3-3c yields the results shown in Figure 3-4. Differencing was performed with non-idealized differentiators having zero-mean random offsets with a standard deviation of 2 mV. Figure 3-4a shows the result of using a spatial separation of D = 1 and Figure 3-4c shows the corresponding thresholded output using a 6 mV threshold. Since the signal magnitudes in (a) are so small, it is difficult to set a threshold level that unambiguously discerns edge locations. Figure 3-4c shows that some portions of the edges are missing and, in addition, spurious or false edges have been indicated. Figure 3-4b shows the differenced results of Figure 3-3c using a spatial separation constant of D = 6. The signal peaks are much larger than the noise values, with some signal peaks exceeding 30 mV. This implies that roughly 75% of the original 40 mV signal has been recovered for the edge detection process. Finally, Figure 3-4d shows the results of applying a 20 mV threshold to the results in (b). If narrow edges are desired, the results shown in Figure 3-4d can be filtered using one of many edge thinning techniques to determine actual edge locations. It is clear, though, that the NNND technique improves edge detection in noisy or low-contrast images. This technique will likely miss narrow edges, but these edges would also likely be missed by previous techniques due to the amount of filtering required. Therefore, Non-Nearest Neighbor Differencing should not adversely affect system performance associated with narrow edges.

Figure 3-4. (a) Results from computing localized differences on the signal in Figure 3-3c using D = 1, with thresholded results shown in (c). Panels (b) and (d) are equivalent to (a) and (c) except that the spatial separation constant was D = 6 and the threshold was increased to 20 mV. Horizontal axes: Spatial Pixel Location.
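A two-dimensional version of the same experiment can be sketched as follows. The grid size, seeds, and helper names are assumptions made for illustration; the separable discrete Gaussian stands in for the smoothing function with σ = 3, and horizontal and vertical differences are computed separately as described in the text.

```python
import math
import random


def gaussian_kernel(sigma):
    """Discrete Gaussian weights truncated at 4 sigma and normalized to unit sum."""
    radius = int(4 * sigma)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Smooth every row, replicating the end values beyond the array."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(w * row[min(max(x + k - radius, 0), width - 1)]
                 for k, w in enumerate(kernel))
             for x in range(width)]
            for row in image]


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_filter_2d(image, sigma):
    """Separable Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(smooth_rows(transpose(smooth_rows(image, kernel)), kernel))


def tower(size=64, side=25, height_mv=40.0, noise_sd_mv=0.0, seed=0):
    """Square tower centered in the image, with optional zero-mean input noise."""
    rng = random.Random(seed)
    lo = (size - side) // 2
    hi = lo + side
    return [[(height_mv if lo <= x < hi and lo <= y < hi else 0.0)
             + rng.gauss(0.0, noise_sd_mv)
             for x in range(size)]
            for y in range(size)]


def horizontal_differences(image, d, offset_sd_mv=0.0, seed=0):
    """Absolute differences between pixels d apart along each row."""
    rng = random.Random(seed)
    return [[abs(row[x + d] - row[x] + rng.gauss(0.0, offset_sd_mv))
             for x in range(len(row) - d)]
            for row in image]


def vertical_differences(image, d, offset_sd_mv=0.0, seed=0):
    """Same computation along columns, done by transposing."""
    return transpose(horizontal_differences(transpose(image), d, offset_sd_mv, seed))


def threshold(differences, level_mv):
    return [[value > level_mv for value in row] for row in differences]


if __name__ == "__main__":
    filtered = gaussian_filter_2d(tower(noise_sd_mv=10.0, seed=3), 3.0)
    for d, level in ((1, 6.0), (6, 20.0)):
        diffs = horizontal_differences(filtered, d, offset_sd_mv=2.0, seed=4)
        marked = sum(sum(row) for row in threshold(diffs, level))
        peak = max(max(row) for row in diffs)
        print(f"D={d}: peak {peak:.1f} mV, {marked} pixels above {level:.0f} mV")
```

In the noiseless case this model puts the D = 1 peak below the 6 mV threshold used in Figure 3-4c, which is consistent with the difficulty of choosing a threshold at D = 1 noted above.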


NNND  in  a Differenced  Gaussian  Architecture 

To demonstrate the Non-Nearest Neighbor Differencing technique, a switching network has been incorporated into a Differenced Gaussian vision processing architecture. Figure 3-5 shows an oscilloscope plot of the one-dimensional array with a 13 mV edge signal introduced at the array center in Trace 2 and the detected edge signal in Trace 3. This is the same array depicted in Chapter 2. No filtering has been applied to the input signal and nearest-neighbor differencing is employed for edge localization.

Figure 3-5. Oscilloscope plot of a one-dimensional Differenced Gaussian array with a 13 mV edge signal introduced at the array center (Trace 2). Trace 3 shows the detected edge signal from the differencing circuitry.

Figure 3-6 shows the same array, but an approximate filtering constant of L = 5 has been applied while the Spatial Separation Constant has been kept at D = 1. The resulting minimum detectable edge signal is 22 mV. It is interesting to note that the minimum detectable edge signal has increased from that shown in Figure 3-5 before the filtering was applied. The reason is that when filtering is applied, the noise and offsets introduced in the receptors and buffers are reduced but the offsets in the edge detectors are not. The filtering therefore spreads out the edge signal presented to the detectors without reducing the edge detector offsets, thus reducing the signal-to-offset (S/O) ratio. Filtering cannot be avoided, though, since systems operating in normal environments will require filtering to reduce the noise occurring on the input image signal in addition to reducing the offsets occurring in the processing circuitry. These results also imply that offsets in the edge detectors are dominating the array performance.

Figure 3-6. Oscilloscope plot of filtered edge signal with characteristic length of approximately L = 5 and the spatial separation constant D = 1. The detectable input edge signal has been increased to 22 mV.

Increasing D to 2 improves the minimum detectable signal to 14 mV as shown in Figure 3-7. Also, the detected edge signal has moved one pixel to the left in going from D = 1 to D = 2, which is an artifact of using NNND. Since comparisons are done over large spatial distances, detected edge locations are not necessarily the physical location of the edge within the array due to the random distribution of offsets in the edge detection circuits.

Figure 3-7. Oscilloscope plot showing 14 mV input edge signal (Trace 2) and detected output signal (Trace 3) using D = 2.

Figure 3-8 shows the results from further increasing the Spatial Separation Constant to D = 3. The minimum detectable edge signal has decreased to 11 mV, but note that the detected edge location has not changed. Finally, Figure 3-9 shows that by increasing D to 7, the minimum detectable signal level improves to 9 mV.

Figure 3-8. Oscilloscope plot showing an 11 mV filtered input signal using L = 5 and computed edge location using D = 3.

There are several points of interest within these figures. First, since the photoreceptors were not used as input sources, the offset levels depicted correspond to buffer offsets only, which are much lower than the combined receptor/buffer offset levels. The lower offset levels allowed for reduced threshold settings in the differencing circuits, resulting in small edge detection levels on the unfiltered inputs of Figure 3-5. This case typically does not occur when receptor inputs are applied. Second, as D increases, the number of unusable differencing circuits at the right end of the array increases, resulting in reduced resolution. Lastly, the threshold settings have been set such that the differencing circuits trigger on edge signals slightly above the offset levels. Therefore, due to the low offset levels in the buffer circuits, the threshold settings have been primarily determined by offsets in the edge detection circuits.
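The measured minimum detectable edge signals quoted above for the filtered array (L of approximately 5) can be collected into a small table to make the improvement relative to nearest-neighbor differencing explicit. The values are those stated in the text; the dictionary and function names are hypothetical.

```python
# Minimum detectable edge signal (mV) versus spatial separation constant D,
# as reported in the text for the filtered array (L of approximately 5).
MEASURED_MV = {1: 22.0, 2: 14.0, 3: 11.0, 7: 9.0}

# Unfiltered, nearest-neighbor baseline reported for Figure 3-5.
UNFILTERED_MV = 13.0


def improvement(measured, reference_d=1):
    """Ratio of the reference minimum detectable signal to each measured value."""
    reference = measured[reference_d]
    return {d: reference / value for d, value in sorted(measured.items())}


if __name__ == "__main__":
    for d, factor in improvement(MEASURED_MV).items():
        print(f"D={d}: {MEASURED_MV[d]:4.1f} mV, {factor:.2f}x better than D=1")
    print(f"Unfiltered baseline: {UNFILTERED_MV:.1f} mV")
```

The table makes two points from the discussion easier to see: most of the gain is obtained in the first step from D = 1 to D = 2, and D = 3 is already enough to do better than the unfiltered baseline.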

Summary  and  Conclusions 

Performing computations such as differencing in analog silicon systems presents designers with a unique set of design constraints. Foremost among them is that offsets inherent in analog circuits establish the minimum signal magnitudes which can be detected reliably. Therefore, it is important to enhance signal magnitudes within an image to make edge detection more robust. Techniques for recovering signals in noisy environments will become more important in the future as feature sizes and components continue to shrink, due to the inverse relationship between device size and offset [42]. As demonstrated in this chapter, edges can be detected within noisy or low-contrast images if one is willing to trade off resolution for more robust performance.

This chapter demonstrated an NNND technique for edge enhancement and feature detection in analog VLSI early vision systems. Simulation results have been shown for both one- and two-dimensional systems and the trade-offs discussed. Increasing the spatial separation when computing intensity differences enhances significant image features at the cost of reduced resolution and a greater likelihood of missing narrow features. Enhancing image features raises signal levels above offset levels, resulting in robust performance in low-contrast or noisy environments. Lastly, several oscilloscope plots have been shown detailing the performance characteristics of the Non-Nearest Neighbor Differencing technique in a fabricated Differenced Gaussian vision processing architecture.


CHAPTER  4 

CONTINUOUS-TIME  LOGARITHMIC  PHOTORECEPTORS 

Introduction 

Recent efforts have investigated optimizing phototransduction circuitry for specific applications, such as time-adaptive receptors for motion detection and offset reduction [7], [16]. The problem with time-adaptive receptors is that motion is required for the system to respond. In the absence of motion, the system adapts to the ambient lighting conditions, settling into a state where the circuitry operates on very small inputs, again rendering performance susceptible to offsets. Requiring motion also limits minimum detectable velocities to those where motion-induced responses are more rapid than adaptation-induced responses. Also, since adaptation occurs more slowly than changes from signals of interest, low-contrast environments again render the system susceptible to offsets. A discussion of several types of receptors can be found in [16] and [25].

The first processing step in a fully integrated, real-time, analog VLSI vision processing system is the conversion of light energy into electrical signals, which can take the form of voltages, currents, or charges. These photo-conversion elements (photoreceptors or photodetectors) are designed to process information in various ways, ranging from continuous-time analog [5], to time-adaptive analog [16], to discrete-time continuous-level [9] designs. Comparisons among various receptor topologies can be found in [17] and [43]. Each conversion method has an associated set of trade-offs and each also has a set of applications to which it is best suited. This chapter discusses characteristics of several receptor topologies implemented in both CMOS and bipolar processes. First, Mead's original receptor [5] is presented and used as a baseline for comparison. Next, a continuous-time, logarithmic photoreceptor incorporating lateral bipolar transistors in a CMOS process is presented. Lastly, results from two receptors implemented in a 0.8 µm double-polysilicon bipolar process are presented as an example of photodetection capabilities in modern processes.

Mead's original receptor has been used in many applications [4], [18], several of which reported that high receptor offsets limited system performance in low-contrast environments. For this reason, more recent designs [7], [15] have incorporated time-adaptive receptors or photodiodes [21], which are less susceptible to device mismatch offsets as previously discussed. Photodiodes inherently possess superior matching characteristics compared to phototransistors, but since they operate at much lower current levels they typically require amplification. This amplification circuitry again introduces offsets into the solution. To this author's knowledge, no quantitative offset measurements of time-adaptive or photodiode receptors have been reported.

At low to moderate intensity ranges, the phototransistor currents produced by Mead's receptor are small enough to allow the PMOS load transistors to operate in subthreshold, resulting in a logarithmic current-to-voltage relationship for increased dynamic range. This introduces a trade-off, since MOSFETs operating in subthreshold exhibit poor matching characteristics due to the same exponential voltage-to-current relationship [44]. Trade-offs are presented for each of the receptor topologies. Portions of this chapter have previously been published [45], [46].
Figure 4-1. The schematic diagram of the original logarithmic photoreceptor developed by Mead [5] (a) and a cross-sectional representation of a phototransistor (b).


Photoreceptors  in  CMOS  Processes 


This section discusses two photoreceptors implemented in a CMOS process for use in edge detection circuits. First, the performance of Mead's original receptor is presented to form a baseline for comparison. Next, another continuous-time, logarithmic receptor possessing lateral bipolar transistors as the load devices is presented. It will be shown that this receptor possesses superior signal-to-offset characteristics and dynamic range when compared to Mead's receptor. Offset results and measurements of the photo-optic dynamic range are presented for each topology. Next, two photodetectors implemented in a 0.8 µm bipolar process are presented to examine detector performance in modern processes.

The Original Logarithmic Photoreceptor

Figure 4-1a shows a schematic of the original logarithmic photoreceptor (OLP) developed by Mead [5] and Figure 4-1b shows the cross-sectional representation of the phototransistor. This design has been extensively used and tested [4], [14], [43], [47-50]. The phototransistor is a parasitic PNP device inherent in standard CMOS N-well processes. The transistor's operation begins when incident photons create electron/hole pairs in the base, base-collector depletion region, and substrate near the base. These carriers are gathered by the base and depletion region to form the transistor's base current. The base current is then multiplied by the current gain, β, to form the collector/emitter current. Thus the phototransistor converts light energy into an electrical current proportional to the incident intensity. The current is drawn from a stack of two diode-connected PMOS transistors which perform two functions: first, they perform a logarithmic current-to-voltage conversion when operating in subthreshold, thereby extending the photo-optic dynamic range, and second, they set the DC operating point for subsequent processing circuitry.
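The logarithmic current-to-voltage conversion performed by the diode-connected load stack can be illustrated with the standard first-order subthreshold model, in which the current depends exponentially on the gate-source voltage. This is a textbook approximation rather than a model taken from the thesis, and the slope factor and pre-exponential current below are hypothetical placeholder values; body effect and drain-voltage dependence are ignored.

```python
import math

THERMAL_VOLTAGE = 0.02585  # kT/q in volts near 300 K
SLOPE_FACTOR = 1.5         # hypothetical subthreshold slope factor n
I_ZERO = 1e-15             # hypothetical pre-exponential current in amperes


def load_voltage(photocurrent, devices=2):
    """Voltage across a stack of identical diode-connected subthreshold transistors."""
    return devices * SLOPE_FACTOR * THERMAL_VOLTAGE * math.log(photocurrent / I_ZERO)


def volts_per_decade(devices=2):
    """Output voltage change for each tenfold change in photocurrent."""
    return devices * SLOPE_FACTOR * THERMAL_VOLTAGE * math.log(10.0)


def current_ratio_for_offset(voltage_offset):
    """Current error in one device equivalent to a given gate-voltage mismatch."""
    return math.exp(voltage_offset / (SLOPE_FACTOR * THERMAL_VOLTAGE))


if __name__ == "__main__":
    for exponent in range(-12, -6):
        current = 10.0 ** exponent
        print(f"I = 1e{exponent} A -> V = {load_voltage(current):.3f} V")
    print(f"Step per decade of intensity: {1e3 * volts_per_decade():.1f} mV")
    print(f"Current ratio for a 10 mV mismatch: {current_ratio_for_offset(0.010):.2f}")
```

The same exponential that compresses many decades of current into a modest voltage swing also turns a small voltage mismatch into a large current error, which is the matching trade-off noted for subthreshold operation.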

The  Lateral  Bipolar  Photoreceptor 

Figure 4-2 shows schematic representations of both Mead's Original Logarithmic Photoreceptor and the Lateral Bipolar Photoreceptor (LBP) for comparison. The phototransistors used in both receptors are identical but the load devices are different. In the original receptor, PMOS load devices are used which exhibit a logarithmic response only when operating in the subthreshold region. In the LBP, however, lateral bipolar transistors (LBTs) formed from PMOS devices operated in their lateral bipolar mode [51] are used, which provide a logarithmic current-to-voltage relationship over a larger range of currents. This increases dynamic range and, as shown later, the matching characteristics are better for the LBTs [52].

Seveial  trade-offs  are  associated  with  the  LBTs.  First,  they  require  a larger  physi- 
cal layout  area  versus  PMOS  devices.  Second,  there  is  no  buried  layer  under  the  transistor 
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(a) 


(b) 


Figure  4-2.  Schematic  representations  of  Mead’s  original  logarithmic  photoreceptor 
in  (a)  and  the  Lateral  Bipolar  Photoreceptor  in  (b).  The  phototransistors  in  both 
receptors  are  identical  but  the  load  devices  have  changed  from  PFETs  operating  in 
subthreshold  to  PFETs  operating  in  their  lateral  bipolar  mode  forming  pnp 
transistors.  The  gate  voltage  of  the  LBTs  can  be  increased  to  improve  matching 
characteristics  [51], 


to  collect  vertical  currents  thereby  creating  two  bipolar  transistors  operating  simulta- 
neously; one  operating  laterally  and  the  other  vertically  as  depicted  in  Figure  4-3.  The 
vertical  bipolar  reduces  the  collector  current  from  the  lateral  device  which  consequently 
reduces  the  lateral  transconductance.  This  can  be  remedied  by  using  a technology  incor- 
porating buried  collectors.  The  third  trade-off  is  that  an  additional  bias  voltage  is  required 
Figure 4-3. Cross-section of a lateral bipolar transistor in a standard CMOS process
showing the resulting lateral and vertical transistors. The vertical transistors are
parasitic devices formed in standard CMOS processes lacking a buried collector
layer.

Figure 4-4. Photo-optic response curves taken from the OLP and the LBP. The OLP
response remained logarithmic over 4 to 5 orders of magnitude while the LBP
remained logarithmic over 7 to 8 orders of magnitude of illumination intensity.
Horizontal axis: log of intensity (Foot-Lamberts).

for the gate terminal, exceeding the receptor's emitter voltage (and possibly exceeding the chip's
supply voltage), to achieve the best matching. This bias pushes the channel away from the
silicon surface and into the substrate, thereby avoiding mismatch errors caused by defects
occurring along the silicon surface.

Measured  Results 

Both receptors were fabricated in a 2 µm analog N-well process available
through MOSIS. Note that the phototransistors used in the LBP were double the size of the
devices used in the OLP. Figure 4-4 shows the photo-optic response from both receptors
in units of Foot-Lamberts. Conversions from Foot-Lamberts to other intensity
measurement units are shown in the following equations:
1 Foot-Lambert = 970.209 Candela/m²    (4-10)

1 W/m² = 680 Lux    (4-11)

1 Candela (Cd)/m² = 4π Lux    (4-12)

Thus, knowing that ambient fluorescent office lighting is approximately 2.9 W/m² [16], one
can calculate the corresponding value to be 0.162 Foot-Lamberts, which is noted in the
figure for reference. Office lighting is then approximately in the center of the receptors'
operating range.
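The office-lighting figure can be checked by chaining the three conversions. The following is a minimal sketch that takes the factors in equations (4-10) to (4-12) at face value as stated in the text; the function and constant names are illustrative only.

```python
import math

# Conversion factors exactly as stated in equations (4-10) to (4-12) of the text.
CD_PER_M2_PER_FOOT_LAMBERT = 970.209   # (4-10)
LUX_PER_W_PER_M2 = 680.0               # (4-11)
LUX_PER_CD_PER_M2 = 4.0 * math.pi      # (4-12)


def irradiance_to_foot_lamberts(w_per_m2: float) -> float:
    """Convert W/m^2 to Foot-Lamberts via (4-11), then (4-12), then (4-10)."""
    lux = w_per_m2 * LUX_PER_W_PER_M2
    cd_per_m2 = lux / LUX_PER_CD_PER_M2
    return cd_per_m2 / CD_PER_M2_PER_FOOT_LAMBERT


# Ambient fluorescent office lighting quoted in the text: 2.9 W/m^2.
office_lighting_fl = irradiance_to_foot_lamberts(2.9)
print(f"{office_lighting_fl:.3f} Foot-Lamberts")
```

With the quoted 2.9 W/m² the chain evaluates to roughly 0.162 Foot-Lamberts, matching the value marked in the figures.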

The OLP's response is logarithmic over 4 to 5 orders of magnitude of light intensity
while the LBP response remains logarithmic over 7 to 8 orders of magnitude. The LBP
likely has an even larger dynamic range than the data suggests, but the test equipment
could not exceed the optical intensities shown in the figure. Note also that the output
voltage produced by the LBP is approximately one volt higher than that of the OLP, which must be
considered in processing circuitry design.

The slope of each curve is a measure of the receptor's gain, which is used to
determine the optic resolution, or the minimum intensity difference required between adjacent
receptors for edge detection. The slope of the OLP curve is 207.5 mV/decade while the
slope of the LBP curve is 143.5 mV/decade. These values are the signal component used
to compute the S/N ratio. The LBP response suffers from the lack of a buried collector, as
mentioned previously. Recovering the vertical current would increase the slope and
improve the performance, as will be shown later when the receptors implemented in a
modern bipolar process are discussed.

Offset  data  were  taken  from  96  OLP  devices  and  108  LBP  devices  respectively. 
All  devices  were  sampled  at  several  points  along  each  photo-optic  curve  to  analyze 
Figure 4-5. Gaussian distribution and normalized histogram of the offsets associated
with the original logarithmic photoreceptor at 0.25 and 2.5 Foot-Lamberts respectively.
Note that matching improves as the illumination increases. Horizontal axis: offset
voltage (mV).


changes in offset characteristics with illumination level. Figure 4-5 shows the Gaussian
distribution and normalized histogram data for offsets taken at 0.25 and 2.5 Foot-Lamberts
respectively in the OLP devices. An important characteristic is that matching improves by
20% as the illumination spans the receptor's dynamic range.

The corresponding offset responses for the LBP are shown in Figure 4-6. In this
figure, the first two distributions correspond to illumination levels of 0.25 and 2.5
Foot-Lamberts respectively, while the last was taken at 100 Foot-Lamberts to examine matching
at even higher illuminations. Again, matching improves as illumination levels increase.
Moreover, matching improvements of 47% and 70% respectively are observed when
comparing the LBP to the OLP under identical illuminations.
Figure 4-6. Offset distributions associated with the LBPs at 0.25, 2.5, and 100
Foot-Lamberts respectively. The first two distributions correspond to the intensity levels
shown in Figure 4-5 for the OLPs. Again, matching improves as the illumination
levels increase. Additionally, matching improvements of 47% and 70% respectively
are observed when compared to the OLPs.


The signal-to-offset (S/O) ratios are calculated by dividing the slope of the photo-optic
response curve by the standard deviation of the offsets, as indicated in the appropriate
figures. The LBPs exhibit S/O ratios between 31% and 137% higher than those of the OLPs
at similar intensities. The LBPs tested did not bring the LBTs' gate bias off-chip;
the bias was directly connected to V_DD. Therefore, matching can likely be improved
if proper biasing is applied [51].
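The S/O calculation described above can be sketched as follows. The slopes are the measured values quoted in the text; the offset standard deviations are hypothetical placeholders, since the measured values appear only in the figures. The contrast helper is an added interpretation, not the author's method: it assumes an intensity step becomes resolvable once its response equals one offset standard deviation.

```python
# Slopes of the photo-optic response curves quoted in the text (mV/decade).
OLP_SLOPE_MV_PER_DECADE = 207.5
LBP_SLOPE_MV_PER_DECADE = 143.5

# Hypothetical offset standard deviations (mV), for illustration only.
OLP_SIGMA_MV = 10.0
LBP_SIGMA_MV = 4.0


def signal_to_offset(slope_mv_per_decade: float, sigma_mv: float) -> float:
    """S/O ratio: response slope divided by the offset standard deviation."""
    return slope_mv_per_decade / sigma_mv


def one_sigma_contrast(slope_mv_per_decade: float, sigma_mv: float) -> float:
    """Fractional intensity step whose response equals one offset sigma.

    Assumes a purely logarithmic response, so sigma / slope is a step in decades.
    """
    return 10.0 ** (sigma_mv / slope_mv_per_decade) - 1.0


so_olp = signal_to_offset(OLP_SLOPE_MV_PER_DECADE, OLP_SIGMA_MV)
so_lbp = signal_to_offset(LBP_SLOPE_MV_PER_DECADE, LBP_SIGMA_MV)
print(f"OLP S/O = {so_olp:.2f}, LBP S/O = {so_lbp:.2f}")
print(f"OLP one-sigma contrast = {one_sigma_contrast(OLP_SLOPE_MV_PER_DECADE, OLP_SIGMA_MV):.1%}")
print(f"LBP one-sigma contrast = {one_sigma_contrast(LBP_SLOPE_MV_PER_DECADE, LBP_SIGMA_MV):.1%}")
```

The sketch illustrates why a receptor with a lower slope can still give the better S/O ratio when its offsets are sufficiently smaller.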

Additionally, the receptor offsets are a result of superimposing mismatches from
both the phototransistor and load devices. If these offset levels were dominated primarily
by the phototransistor mismatches, a simple solution would be to increase the number of
load devices until a desired signal-to-offset ratio is achieved. This is possible since
increasing the number of load devices effectively increases the circuit's transresistance, or
current-to-voltage gain. However, the decrease in offsets with increasing illumination
levels for both receptors can be attributed to the transconductance-to-current ratio [44], [53]
in the load devices. Another factor contributing to receptor mismatch is the difficulty in
ensuring that equal and uniform illumination levels are presented to all the receptors
simultaneously when the offsets are measured. For these tests, a calibrated visible light
source was used for photo-optic characterization; however, spatial illumination variations
are still likely to occur.

Transistor matching can be described by

$$\frac{\Delta I}{I} = \Delta V_{GS} \, \frac{g_m}{I}$$    (4-1)

where $\Delta I$ represents the current mismatch, $I$ is the bias current, $g_m$ is the
transconductance, and $\Delta V_{GS}$ describes the change in gate voltage as a function of bias current. For
the OLP, the equation describing the subthreshold transconductance is [53], [55]


$$g_m = \frac{I_D}{n U_t}$$    (4-2)

where $U_t = kT/q$, and $n$ is the slope factor, which is approximately 1.3. In strong
inversion, the transconductance is

$$g_m \approx \sqrt{2 \beta I_D}$$    (4-3)

where $\beta = \mu C_{ox} W/L$. Also in subthreshold,

$$V_{GS} = n U_t \ln\left(\frac{I_D}{I_S}\right)$$    (4-4)

and in strong inversion,

$$V_{GS} \approx \sqrt{\frac{2 I_D}{\beta}} + V_{TO}$$    (4-5)
In subthreshold one can see that $g_m/I_D$ increases slowly, proportional to $1/n$,
while $\Delta V_{GS}$ increases proportional to $\ln(I_D)$. Therefore the parameter of interest
becomes $\Delta V_{GS}/I_D$, which decreases at a rate proportional to $\ln(I_D)/I_D$. Thus the
ratio $\Delta I/I$ effectively decreases as the bias current increases. This continues into strong
inversion, where $g_m/I$ decreases proportional to $\sqrt{I_D}/I_D$ while $\Delta V_{GS}$ also decreases at
a rate proportional to $\sqrt{I_D}/I_D$. Thus one can conclude that net offset variations will
decrease as bias current increases.
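The trends quoted above can be evaluated numerically. This is a minimal sketch, assuming the first-order expressions in equations (4-2) and (4-3); taken at face value, (4-2) makes the subthreshold ratio the constant 1/(n U_t). The thermal voltage, the value of beta, and the saturation current are hypothetical illustration values that the text does not give.

```python
import math

N_SLOPE = 1.3    # slope factor n quoted in the text
U_T = 0.0259     # thermal voltage kT/q in volts near room temperature (assumed)
BETA = 100e-6    # hypothetical beta = mu * Cox * W / L, in A/V^2
I_S = 1e-15      # hypothetical saturation current, in amperes


def gm_over_id_subthreshold(i_d: float) -> float:
    """g_m / I_D from equation (4-2): independent of the drain current."""
    return 1.0 / (N_SLOPE * U_T)


def gm_over_id_strong_inversion(i_d: float) -> float:
    """g_m / I_D from equation (4-3): falls as 1 / sqrt(I_D)."""
    return math.sqrt(2.0 * BETA * i_d) / i_d


def log_term_over_current(i_d: float) -> float:
    """ln(I_D / I_S) / I_D, the trend quoted for the subthreshold offset term."""
    return math.log(i_d / I_S) / i_d


for i_d in (1e-9, 1e-8, 1e-7):
    print(f"subthreshold I_D={i_d:.0e} A: gm/ID={gm_over_id_subthreshold(i_d):.1f} 1/V, "
          f"ln(ID/IS)/ID={log_term_over_current(i_d):.3e}")
for i_d in (1e-5, 1e-4):
    print(f"strong inversion I_D={i_d:.0e} A: gm/ID={gm_over_id_strong_inversion(i_d):.2f} 1/V")
```

Under these assumptions both the logarithmic term divided by current and the strong-inversion ratio fall as the bias current rises, which is the direction of the argument in the text.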


A similar set of results can be shown for the LBTs, where

$$V_{GS} \approx V_{BE} = U_t \ln\left(\frac{I_C}{I_S}\right)$$    (4-6)

which closely resembles the subthreshold MOSFET response shown in equation (4-4). In
the LBTs, however, $g_m \propto I_C$ throughout their operating range; therefore the parameter of
interest is $\Delta V_{BE}/I_C$. From equation (4-6), it can be shown that $\Delta V_{BE}/I_C$ decreases
proportional to $\ln(I_C)/I_C$.


Photodetectors  in  a Bipolar  Process 

Two additional photodetectors have been implemented in a 0.8 µm, 25 GHz f_T,
double-polysilicon bipolar process developed for wireless applications [54]; Figure 4-7
shows the circuit schematic for both. A vertical substrate pnp is used to collect and
convert electron-hole pairs created in the substrate into an emitter current, just as in the CMOS
devices. The p+ emitter, n base, and p- collector are formed using the p+ extrinsic base
doping of the npn, the lightly doped n collector of the npn bipolar structure, and the p-
substrate respectively. The emitter current is drawn from a stack of three diode-connected
pnp transistors performing a logarithmic current-to-voltage conversion. Three load
Figure 4-7. Schematic of the logarithmic photodetectors implemented in a bipolar
process. The circuit consists of a parasitic bipolar phototransistor and a stack of
three diode-connected pnp transistors performing a logarithmic current-to-voltage
conversion.


devices were used in this implementation to adjust the dc operating point lower to
conform with the transamp buffer's input common-mode range. Lastly, the n-base of both
detectors is left floating.


The  photo-optic  response  curves  for  both  detectors  are  shown  in  Figure  4-8.  The 
photo-optic  dynamic  range  exceeds  five  orders  of  magnitude.  The  upper  dynamic  range 


Figure 4-8. Photo-optic response curves for the multiple and standard detectors
fabricated in a bipolar process. Office lighting is approximately 0.162 Foot-Lamberts
as indicated in the figure.
limit could be higher, but the test equipment was unable to exceed the intensity levels
indicated in the figure. The slope of both curves is approximately 209.3 mV/decade, and while
the photo-optic responses of both detectors are nearly identical, comparisons to the CMOS
responses in Figure 4-4 show several differences. First, the detectors from the Bipolar
process do not respond to ambient lighting below approximately 5×10⁻² Foot-Lamberts,
which is one and a half orders of magnitude higher than the CMOS devices. This is probably due
to the shallower base doping, which limits collection of light-induced electron-hole pairs in
the substrate and consequently limits low-lighting performance. Second, the slope of the
Bipolar process detectors is on the order of that of the OLP curve, which makes it 45.9%
greater than that of the LBP curve. The higher slope is due to having an additional pnp in the
load stack and to the bipolar process having buried collectors, which recover the vertical currents.

The  difference  between  the  two  implemented  detectors  is  that  the  standard  detector 
is  formed  using  a single  highly  doped  emitter  region  inside  a single  n-base  tub  while  the 
multiple detector is formed using multiple highly doped emitter regions inside multiple
n-base tubs. Figure 4-9 shows the top and cross-sectional views of the standard detector.
The total area of the detector region for both devices is 30 µm × 39.2 µm. Figure 4-10
shows  the  top  view  of  the  multiple  detector.  In  this  design,  the  multiple  emitter  regions 
are  bussed  together  to  form  the  output  current.  The  cross-sectional  view  is  the  same  as  in 
the  standard  detector. 

Offset characteristics are shown in Figure 4-11 for the multiple detector and in
Figure 4-12 for the standard detector. Offsets were measured by holding the ambient
lighting constant and measuring the output voltage from each detector in the array.
Offsets were measured at intensities of 0.25, 1, and 100 Foot-Lamberts.
Signal-to-offset calculations are shown in the respective figures. The standard detector
exhibits a superior S/O ratio compared to the multiple detector over the ambient lighting
conditions tested. The offsets are primarily due to mismatches in the load devices, but some
mismatch is due to the phototransistors. Note when making comparisons to the CMOS
results in Figure 4-5 and Figure 4-6 that the CMOS realizations have undergone some
optimization whereas the bipolar detectors have not.

Figure 4-9. Top view and cross-sectional view of the standard detector.

Figure 4-10. Top view of the multiple detector.

Figure 4-11. Multiple detector offset distributions in Bipolar process. Horizontal
axis: offset voltage (mV).

Figure 4-12. Standard detector offset distributions in Bipolar process.

Conclusions

This chapter presented results from four photoreceptors: two implemented in a
CMOS process and two implemented in a Bipolar process. The two CMOS receptors
were compared in the areas of photo-optic dynamic range and signal-to-offset ratio. It was
shown that the LBP exhibits superior dynamic range and S/O ratio when compared to the OLP.
Receptors from a Bipolar process were also presented to examine potential performance
improvements associated with using modern processes. It was shown that modern
processes, especially processes incorporating buried collector layers, demonstrate potential
performance enhancements for continuous-time receptors by collecting vertical collector
currents and subsequently increasing receptor gain. A trade-off associated with modern
processes, however, is decreased sensitivity in low-lighting environments due to shallower
doping regions.


CHAPTER  5 
TRANSAMP  OFFSETS 

Introduction 

Minimum  detectable  signal  levels  in  vision  systems  depend  upon  the  cumulative 
offsets  and  finite  gain  exhibited  by  the  photoreceptors,  followers,  and  edge  detectors. 
Photoreceptor response characteristics were discussed in Chapter 4; this chapter therefore
discusses performance characteristics associated with the subsequent processing stages:
followers and edge detectors. The chapter includes a presentation of measured transamp
offsets and a discussion of parameters contributing to those offsets for a variety of
realizations. The data collected from those transamp realizations are presented to characterize
random offset performance.

Offsets can be broken into two components: systematic and random. Systematic
offsets are due to architectural imbalances within the circuit resulting from different paths
from the inputs to outputs, in addition to the Early effect [56], [57]. Random offset is
defined as an unpredictable output variation caused by device mismatch. Therefore,
systematic offset is a constant or predictable error and random offset is an unpredictable error.
Both mechanisms exist simultaneously, however, so their effects are superimposed. The
vision architectures discussed in this work are primarily concerned with random offsets
since systematic offsets are canceled in the computational circuitry. The following section
addresses offsets in continuous-time vision systems. The subsequent sections discuss
offset characteristics of various transamps incorporating architectural and layout variations.
Lastly, this chapter concludes with a discussion of topology and layout considerations for
reducing offsets. Portions of this chapter have previously been published [46].

Offsets  in  Vision  Architectures 

Device  mismatch  has  been  studied  extensively  as  applied  to  inter-device  mismatch 
and  fundamental  circuit  component  mismatch  such  as  those  found  in  current  mirrors  and 
differential pairs [2], [42], [44], [51], [52], [56-62]. Offsets in more complex circuits,
such  as  transamps  or  opamps,  have  been  reported  but  typically  the  reported  values  are 
based  on  low  quantity  averages  or  single  quantity  measurements  [63-65]  and  represent 
both  the  systematic  and  random  offset  components.  Random  offsets  exhibited  by  more 
complex  circuits  have  not  been  characterized  as  well  due  to  the  large  variety  of  designs 
and  the  cost  to  fabricate  and  test  large  quantities. 

As  discussed  in  Chapter  3,  random  offsets  degrade  computational  accuracies,  lim- 
iting low-contrast  performance.  Offsets  in  the  voltage  followers,  buffering  the  receptor 
outputs  into  the  spatial  filters,  increase  filtering  requirements  thereby  attenuating  high  fre- 
quency information  in  desired  signals.  Additionally,  offsets  in  the  edge  detection  circuitry 
require  higher  thresholds  to  avoid  false  edge  detections. 

In  the  Differenced  Gaussian  architecture  depicted  in  Figure  5-1,  systematic  offsets 
accumulated  in  the  receptors  and  buffers  are  canceled  by  the  spatial  differencing  circuits. 
Any  constant  or  uniform  offset  introduced  by  these  circuits  is  seen  at  both  input  terminals 
of  the  differencing  circuitry  effectively  producing  no  difference  and  subsequently  no  out- 
put. When  random  offsets  are  introduced,  however,  they  appear  as  discontinuities 


Figure 5-1. The Differenced Gaussian architecture introduced in Chapter 3 has been
repeated for the reader's convenience.


between  adjacent  pixels.  These  offsets  can  be  reduced  by  the  filtering  circuitry  which 
spreads  the  input  signals  among  some  number  of  neighboring  cells  but  at  the  expense  of 
also  spreading  the  desired  input  signal  among  the  same  number  of  cells. 
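The distinction between the two offset types can be illustrated with a small numerical experiment. This is a sketch under stated assumptions, not a model of the fabricated circuits: the offset magnitudes are hypothetical, the scene is flat, and a 3-tap moving average stands in for the spatial filter.

```python
import random
import statistics

random.seed(0)                 # fixed seed so the run is repeatable

N_PIXELS = 1000
SYSTEMATIC_MV = 5.0            # hypothetical uniform offset on every pixel
RANDOM_SIGMA_MV = 2.0          # hypothetical random offset standard deviation


def adjacent_differences(values):
    """Spatial differencing: difference between each pair of neighboring pixels."""
    return [b - a for a, b in zip(values, values[1:])]


def smooth3(values):
    """3-tap moving average standing in for the spatial filter (ends left as-is)."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out


scene = [0.0] * N_PIXELS                                   # flat scene, no edges
uniform = [s + SYSTEMATIC_MV for s in scene]               # systematic offset only
noisy = [u + random.gauss(0.0, RANDOM_SIGMA_MV) for u in uniform]

uniform_diff = adjacent_differences(uniform)
raw_diff_sigma = statistics.pstdev(adjacent_differences(noisy))
smoothed_diff_sigma = statistics.pstdev(adjacent_differences(smooth3(noisy)))

print("largest difference from systematic offset alone:", max(abs(d) for d in uniform_diff))
print(f"difference sigma with random offsets: {raw_diff_sigma:.2f} mV")
print(f"difference sigma after smoothing:     {smoothed_diff_sigma:.2f} mV")
```

The uniform offset vanishes in the differences, while the random offsets survive and are only reduced by filtering, at the cost of smoothing any real edge by the same amount.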

While error sources preceding the spatial filters are reduced, offsets originating
within the spatial differencing circuitry are not. These offsets are modeled by
superimposing the edge detector's input-referred random offset voltage onto the filtering circuit
outputs, creating discontinuities between cells. Two circuits used to perform spatial
differencing  are  shown  in  Figure  5-2.  Figure  5-2a  is  the  original  absolute  value  circuit 
designed  by  Mead  [36]  which  is  composed  of  two  5-transistor  transamps  and  two  current 
mirrors  which  produce  an  output  current  proportional  to  the  input  voltage  difference.  A 
thresholding  transistor  has  been  added  to  complete  the  circuit’s  function.  Both  transamps 
operate  in  an  open-loop  configuration  yielding  the  high  gain  needed  to  detect  small  edge 
signals. 

The transamp in Figure 5-2a produces an output current proportional to the
difference in input voltages as described by equation (5-1)
Figure 5-2. Two absolute value circuits are shown above. The original circuit, (a), was
designed by Mead [36] and is composed of two 5-transistor transamps, two current
mirrors, and a thresholding transistor. The second circuit, (b), is a simplified version
designed to reduce random offset variations.


where Ibias is the transamp bias current, n is the slope factor, and Ut = kT/q. A
similar equation describes Iout2. The additional current mirrors shown in (a) rectify the
transamp output currents based on the operating point. For example, assume Vin2 is at a fixed
voltage and Vin1 is slowly increased. While the voltage difference remains within the
transamp's linear range, the current from both mirrors is summed on the node Vout. When
the voltage difference becomes large enough to drive TA1's output to the positive supply,
the corresponding current mirror is driven into cutoff, leaving TA2 as the only information
source.
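The transamp behavior described above can be sketched numerically. The block below assumes the standard subthreshold form Iout1 = Ibias tanh((Vin1 - Vin2) / (2 n Ut)) for a 5-transistor transamp; that form is an assumption here rather than a reproduction of equation (5-1). The slope factor of 1.3 comes from Chapter 4, the 28 nA bias is borrowed from the measurements reported later in this chapter, and the thermal voltage is an assumed room-temperature value.

```python
import math


def transamp_iout(v_in1: float, v_in2: float, i_bias: float,
                  n: float = 1.3, u_t: float = 0.0259) -> float:
    """Output current of a subthreshold transamp (assumed tanh form).

    v_in1, v_in2 are in volts, i_bias in amperes, n is the slope factor,
    u_t is the thermal voltage kT/q in volts.
    """
    return i_bias * math.tanh((v_in1 - v_in2) / (2.0 * n * u_t))


I_BIAS = 28e-9   # bias current in amperes, used here only for illustration

for dv_mv in (0.0, 5.0, 20.0, 100.0, 1000.0):
    i_out = transamp_iout(2.5 + dv_mv * 1e-3, 2.5, I_BIAS)
    print(f"dV = {dv_mv:7.1f} mV -> Iout = {i_out * 1e9:6.2f} nA")
```

For small voltage differences the output is nearly linear in the difference; for large differences it saturates at the bias current, which is the regime in which one current mirror in circuit (a) is driven into cutoff.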

The second absolute value circuit, Figure 5-2b, is a modification of the wide range
transamp [36] with the lower NMOS transistor mirror pair replaced by a set of
thresholding transistors controlled by an off-chip bias voltage. The second circuit was designed to
reduce the number of circuit components and thereby reduce the random offset
variations. Note that the subthreshold equations shown below have left out the constant Ut for
simplicity. Therefore all equations are assumed to be normalized to the factor Ut unless
otherwise noted. The output current equation for this transamp can be derived starting
from Kirchhoff's Current Law (KCL)


$$I_{out1} = I_1 - I_2$$    (5-2)


In addition, it can be shown that [36]

where κ = 1/n, which is another form of describing the transistor slope factor.
Therefore, assuming an ideal current mirror relationship,

and substituting Vs = 0 or ground yields a description for Iout1 valid over the transamp's
linear range

A similar equation can be shown for Iout2.

The remainder of this chapter is organized as follows: the next section discusses
the offset characteristics of current mirrors and differential pairs and how those offsets
affect transamps. Following this is a discussion of how offsets are affected by the DC
operating point. Next is a presentation of measured offset results from ten transamp
implementations. Lastly, current measurements from lateral bipolar and lateral bipolar
cascode mirrors are presented.

Transamp  Component  Offset  Characteristics 

Two  fundamental  transamp  components  are  current  mirrors  and  differential  pairs. 
From the transistor offset characteristics addressed in Chapter 4, it is shown that the
transconductance-to-current (Gm/I) ratio describes the potential mismatch between
devices. Device mismatch is described in equation (4-1), where ΔVGS represents the
change in gate voltage with respect to changing operating point. For current mirrors,
reducing the Gm/I ratio reduces the mirror offset. This theory shows that the Gm/I
ratio decreases as bias current increases, thereby improving device mismatch performance.
This implies that current mirrors possess better matching characteristics in strong
inversion, which agrees with previously published results [44].

The  opposite  is  true  for  differential  pairs.  Large  Gm/I  ratios  allow  differential 
pairs  to  compensate  for  mismatched  drain  currents  with  smaller  input  offset  voltages. 
Thus  differential  pairs  exhibit  superior  offset  characteristics  in  subthreshold  versus  strong 
inversion.  Overall,  circuits  in  this  research  are  concerned  with  subthreshold  operation  to 
minimize  power  consumption,  therefore  most  data  presented  in  the  following  sections  is 
taken  from  devices  operating  in  subthreshold.  An  interesting  question,  though,  is  whether 
offsets  in  the  mirrors  or  differential  pairs  dominate  performance.  It  will  be  shown  empiri- 
cally that  offsets  originating  in  the  current  mirrors  dominate  transamp  offset 
characteristics  in  these  implementations. 
Transamp Offsets


This section presents offset measurements taken from ten transamp realizations
incorporating various topological and layout variations. Data from each implementation
were taken from approximately 100 devices, and each realization with the exception of one
was constrained to a single process run. Therefore an analysis of multiple-run offset
characteristics cannot be performed. Each section discusses one realization, with the best
presented first and variations presented subsequently. All realizations were implemented in a
2 µm double-poly, double-metal, analog, N-well process incorporating vertical NPN
transistors provided through MOSIS [6], except for the last, which was fabricated in a
0.8 µm double-polysilicon bipolar process.

One characteristic of the transamps presented is that as the current mirror
transistors move lower in saturation, the offsets increase. To explain this, recall from equation
(4-1) that transistor drain current mismatch is related to the rate of change in the Gm/IDS
ratio. To further examine this, one can see from equation (5-7) that in subthreshold the
Gm/IDS ratio is constant with respect to changes in VDS,

where n is the slope factor, Ut = kT/q is the thermal voltage, and VA is the Early
voltage.

VGS, however, is not constant with respect to VDS. Beginning with the first-order
subthreshold drain current equation and accounting for the Early voltage as shown in
equation (5-8) [36], one can solve for VGS as shown in equation (5-9). Taking the partial
derivative with respect to VDS, as shown in equation (5-10), results in the expression in
equation (5-11), where VGS is expressed as a function of VDS.

This equation is plotted in Figure 5-3, where Ut is set to 40 mV and the Early
voltage magnitude, VA, is set to 20 V. The value of VA was chosen based on previous
transistor measurements from this process for a W = 6 µm by L = 6 µm PMOS device. As can be
seen, ΔVGS changes at a rate greater than Gm/IDS, resulting in larger offset variances
when the transistors are operated lower in saturation. As will be seen from the data, this
relationship is observed when the offset deviations increase with increasing input bias
levels.
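The dependence of VGS on VDS can be sketched as follows. The block assumes a common first-order subthreshold model, IDS = I0 exp(VGS / (n Ut)) (1 + VDS / VA), which may differ in detail from equations (5-8) to (5-11). Ut = 40 mV and VA = 20 V follow the values quoted for Figure 5-3; n = 1.3 comes from Chapter 4, while I0 and the 28 nA drain current are hypothetical illustration values. Voltages are treated as magnitudes, so no PMOS sign convention is implied.

```python
import math

N_SLOPE = 1.3    # slope factor n from Chapter 4
U_T = 0.040      # value of Ut quoted for Figure 5-3, in volts
V_A = 20.0       # Early voltage magnitude quoted for Figure 5-3, in volts
I_0 = 1e-15      # hypothetical pre-exponential current, in amperes


def v_gs(i_ds: float, v_ds: float) -> float:
    """VGS needed to carry i_ds at a given v_ds under the assumed model."""
    return N_SLOPE * U_T * (math.log(i_ds / I_0) - math.log(1.0 + v_ds / V_A))


def dvgs_dvds(v_ds: float) -> float:
    """Analytic partial derivative of VGS with respect to VDS at fixed current."""
    return -N_SLOPE * U_T / (V_A + v_ds)


I_DS = 28e-9     # hypothetical drain current, in amperes
for v_ds in (0.5, 1.0, 2.5, 4.0):
    print(f"VDS = {v_ds:.1f} V: VGS = {v_gs(I_DS, v_ds) * 1e3:.2f} mV, "
          f"dVGS/dVDS = {dvgs_dvds(v_ds) * 1e3:.3f} mV/V")
```

Under this assumed model the magnitude of the derivative grows as VDS falls, which agrees in direction with the observation that offsets worsen when the mirror transistors operate lower in saturation.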


Five-Transistor, NMOS Input Transamp

Performance  comparisons  among  the  various  implementations  reported  here  show 
the  best  random  offset  characteristics  are  obtained  from  implementations  comprising  the 
fewest  transistors.  The  transamp  topology  yielding  the  lowest  random  offset  variations  is 
shown  in  Figure  5-4.  The  circuit  consists  of  an  NMOS  differential  pair  input  stage  with  an 
active  PMOS  current  mirror  load.  The  DC  bias  current,  Ibias,  is  set  by  an  NMOS  transistor 
whose  gate  voltage  is  controlled  through  an  off-chip  global  bias  line. 

The Gaussian distribution and histogram data of the offset voltages taken from 108
devices with input voltages of 2.5 V and 4 V are shown in Figure 5-5a and
Figure 5-5b respectively. The DC bias current was approximately 28 nA. The offset
standard deviations are 1.62 mV and 1.76 mV respectively, yielding a 140 µV difference over
the input range tested. The input voltages were varied to examine the relationship


Figure  5-4.  Five  transistor  transamp  realization 
yielding  the  smallest  random  offset  variations. 
The  circuit  is  composed  of  an  NMOS 
differential  pair  input  stage  with  an  active 
PMOS  current  mirror  load.  The  circuit  shown 
is  configured  as  a buffer/follower  which  was 
the  method  used  to  test  the  offset  voltage.  The 
DC  bias  current  is  controlled  through  an  NMOS 
bias  transistor  whose  gate  was  connected  to  a 
global  bias  line. 
between random offsets and operating point. The results imply that the current mirrors are more
adversely affected by lower VDS values than the differential pair is positively affected by
higher VDS values.

The transistors composing the differential pair and current mirror were laid out
using a common-centroid geometry to cancel fabrication gradients in any direction. Each
current mirror transistor is composed of two PMOS transistors of size W = 22 µm by
L = 11 µm, yielding composite W = 44 µm by L = 11 µm transistors. Each differential pair
transistor is similarly composed of two transistors sized at W = 20 µm by L = 9 µm, forming
composite W = 40 µm by L = 9 µm transistors. To better average local parameter variations,
transistor sizes were maximized within the constraint of not exceeding a 50 µm cell width,
which was set by system resolution considerations.

This topology was implemented a second time with slightly smaller transistors to
examine size versus layout considerations. The second implementation also used
common-centroid layouts, but the differential pair transistors were composed of two transistors


Figure 5-5. Gaussian distribution and histogram plot of offset voltages associated with
the five-transistor NMOS input transamp. The results in (a) and (b) differ only in that
the input voltage was changed from 2.5 V to 4 V, which resulted in a 140 µV increase
in the offset deviation.

Figure 5-6. Gaussian distributions and histogram plots of the second set of NMOS
input transamp offsets. The distributions are for two input voltages, 1 V and 2.5 V,
taken to examine the offset variation as a function of the DC operating point. The
offset degrades approximately 110 µV as the operating point increases. The DC bias
current in the bias transistor was approximately 15 nA.

sized at W = 16 µm by L = 7 µm, yielding composite W = 32 µm by L = 7 µm transistors, and
the mirror transistors were made up of two W = 16 µm by L = 8 µm devices, yielding
composite W = 32 µm by L = 8 µm transistors. The bias transistor was W = 12 µm
by L = 7 µm. Due to the topology of the off-chip buffering amplifiers, input bias voltages
above approximately 3.7 V could not be tested reliably.

Figure 5-6 shows the offset distributions for the second set of transamps taken
from 108 devices contained in a single process run. The distributions were taken at input
voltages of 1 V and 2.5 V, over which the offset deviation degraded by approximately
110 µV. Comparing these curves to those in Figure 5-5 reveals that the second
realization exhibits worse offset characteristics, attributed primarily to its smaller
transistor sizes. In the first realization, the mirror transistors had approximately 189% of
the gate area of those in the second realization (about 89% larger), while the differential
pair transistors had nearly 161% (about 61% larger). The bias current was approximately
15 nA in these measurements.
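
The size comparison above can be checked directly from the composite transistor dimensions quoted in the text. The following is a minimal sketch; the dictionary and function names are illustrative only, and gate area is taken simply as W times L.

```python
# Composite gate dimensions (W, L) in micrometres, as quoted in the text.
first = {"mirror": (44, 11), "diff_pair": (40, 9)}
second = {"mirror": (32, 8), "diff_pair": (32, 7)}


def gate_area(w_um, l_um):
    """Gate area of one composite transistor in square micrometres."""
    return w_um * l_um


def area_ratio(kind):
    """Ratio of first-realization to second-realization gate area."""
    return gate_area(*first[kind]) / gate_area(*second[kind])


for kind in ("mirror", "diff_pair"):
    r = area_ratio(kind)
    print(f"{kind}: {r:.2f}x the area ({(r - 1) * 100:.0f}% larger)")
```

The ratios come out near 1.89 for the mirrors and 1.61 for the differential pair, which is where the 189% and 161% figures originate.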
Figure 5-7. Gaussian distributions and histogram plots of the third set of NMOS input
transamp offset voltages. Curve (a) shows the offset distribution at a bias current of 5
nA and curve (b) shows the distribution at a bias current of 1.2 µA.

A third implementation incorporating even smaller transistors was tested and the
offset results taken from 128 devices are shown in Figure 5-7. The NMOS differential
pair was composed of transistors sized at W=24 µm by L=7 µm while the PMOS mirrors
were sized at W=24 µm by L=8 µm, with all devices laid out in a common-centroid
configuration. Curve (a) was taken using a bias current of 5 nA while curve (b)
was taken using a bias current of 1.2 µA; input voltages for both tests were 2.5 V. The
change in offset deviation between the curves is 0.5 mV, which shows how significantly the
offset decreases as the bias current increases. These results imply that the increased offsets
in subthreshold are attributable to current mirror matching, as previously discussed.
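
As an illustration of how the quantities reported in these figures are obtained, the sketch below computes a sample mean and standard deviation, bins the samples into a histogram, and evaluates the matching Gaussian at each bin centre. The offset values are hypothetical placeholders, not measured data, and the 1 mV bin width is an arbitrary choice.

```python
import math
import statistics

BIN_WIDTH_MV = 1.0  # arbitrary bin width chosen for illustration


def gaussian_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def histogram(samples, bin_width):
    """Count samples per bin; keys are the lower bin edges."""
    counts = {}
    for s in samples:
        edge = math.floor(s / bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical offset readings in millivolts (illustrative, not measured data).
offsets_mv = [-3.1, -1.8, -0.9, -0.4, 0.0, 0.3, 0.7, 1.2, 2.0, 3.0]

mu = statistics.mean(offsets_mv)
sigma = statistics.stdev(offsets_mv)  # sample standard deviation (n - 1)
bins = histogram(offsets_mv, BIN_WIDTH_MV)

print(f"mean = {mu:.2f} mV, sigma = {sigma:.2f} mV")
for edge, count in bins.items():
    centre = edge + BIN_WIDTH_MV / 2
    expected = len(offsets_mv) * BIN_WIDTH_MV * gaussian_pdf(centre, mu, sigma)
    print(f"bin starting at {edge:+.0f} mV: {count} measured, {expected:.1f} expected")
```

Comparing two such fits taken at different bias currents gives the change in offset deviation discussed above.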

Conclusions from these results are that transamp offsets are dominated by the current
mirrors, which depend heavily on DC operating conditions. There are two methods
for minimizing offsets: the first is to operate deep in strong inversion and the second is to
operate deep in saturation. Also, increasing transistor size and incorporating more
sophisticated layout techniques such as common-centroid or inter-digitated geometries
reduces offsets by averaging localized process variations. For all three realizations, the
circuit components have been implemented using common-centroid layouts. Comparisons
between inter-digitated, common-centroid, and other more conventional layouts are shown
in the following sections.
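
One way to see why the inversion level matters is a standard first-order textbook estimate of the input-referred offset of a differential pair with a current mirror load. This estimate is not taken from the analysis in this thesis; it assumes threshold mismatch only, equal drain currents in the input and mirror devices, and ignores current-factor mismatch and systematic (Early effect) terms. The subscripts "in" and "mir" and the slope factor n are notation introduced here for illustration.

```latex
V_{OS} \approx \Delta V_{T,\mathrm{in}}
      + \frac{g_{m,\mathrm{mir}}}{g_{m,\mathrm{in}}}\,\Delta V_{T,\mathrm{mir}},
\qquad
\frac{g_{m,\mathrm{mir}}}{g_{m,\mathrm{in}}} \approx
\begin{cases}
\sqrt{\dfrac{\mu_{\mathrm{mir}}\,(W/L)_{\mathrm{mir}}}{\mu_{\mathrm{in}}\,(W/L)_{\mathrm{in}}}}
  & \text{strong inversion (square law)} \\[2ex]
\dfrac{n_{\mathrm{in}}}{n_{\mathrm{mir}}} \approx 1
  & \text{weak inversion}
\end{cases}
```

Under these assumptions the mirror threshold mismatch is attenuated in strong inversion whenever the mirror transconductance is lower than that of the input pair, but it appears at the input essentially unattenuated in weak inversion. This is consistent with the larger subthreshold offsets reported above. The benefit of operating deep in saturation comes from the neglected drain-voltage dependence and is not captured by this estimate.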

Five-Transistor, PMOS Input Transamp

One variation of the previous topology is to change the input stage to a PMOS
differential pair with an NMOS current mirror active load. Figure 5-8 shows the offset
distribution taken from 108 devices with the PMOS input transistors sized at W=24 µm by
L=7 µm and the NMOS transistors sized at W=24 µm by L=8 µm. Curve (a) corresponds to
a bias current of 3 µA while curve (b) corresponds to a bias current of 20 nA. The input
voltage for both tests was 2.5 V. Comparing Figure 5-8 to Figure 5-7, it can be seen that the
offset deviation is slightly worse for the PMOS input transamp in strong inversion while
the two are the same in weak inversion. This leads to the conclusion that NMOS mirrors
exhibit better matching characteristics than PMOS mirrors, most likely due to their higher
Early voltages.

Figure 5-8. Gaussian distribution and histogram data for the PMOS input transamp.
Curve (a) corresponds to a bias current of 3 µA while curve (b) corresponds to a bias
current of 20 nA. The input voltage for both tests was 2.5 V.

Figure 5-9. Wide-range transamp topology. This design was implemented to gauge the
offset performance of the five-transistor designs. It has a nearly rail-to-rail output
swing and a balanced design. Trade-offs are increased size and power consumption.

Wide-Range, NMOS Input Transamps

Several realizations of another topology, shown in Figure 5-9, were tested to gauge
the offset performance of the five-transistor realizations. Several desirable characteristics
of this ‘wide-range’ transamp [57] include a nearly rail-to-rail output swing and a balanced
design. Drawbacks are increased size and power consumption. The first implementation
consisted of the NMOS input transistors laid out using a common-centroid geometry with
the current mirrors laid out using an interdigitated configuration [2]. Four transistors of
size W=6 µm by L=6 µm are combined to form composite W=24 µm by L=6 µm input
transistors, while the mirrors are composed of two transistors each of size W=6 µm by
L=6 µm, forming composite W=12 µm by L=6 µm devices. Figure 5-10 shows the
Gaussian distribution and histogram data of the offset measurements taken from 124
wide-range transamps from a single process run. Figure 5-10a corresponds to measurements
taken in strong inversion, approximately 2 µA bias current, while Figure 5-10b
corresponds to subthreshold operation, approximately 12 nA bias current. Comparisons to
the five-transistor NMOS input transamp data show these devices have distinctly higher
offsets, especially in subthreshold. Also, comparisons between curves (a) and (b) again
show how offsets increase with decreasing bias current level, leading to the conclusion that
mirror mismatch again dominates the offset. The two additional current mirrors likely
exacerbate the offset problems, resulting in the higher offset deviations.

Figure 5-10. Gaussian distribution and histogram data for the random offsets
associated with the wide-range transamps. Curve (a) corresponds to measurements
taken in strong inversion while curve (b) corresponds to subthreshold operation.

A second realization of this same topology was fabricated, but the transistor sizes
were maximized within the 50 µm cell width constraint to examine offset improvements.
The NMOS input transistors were composed of four transistors in a common-centroid
layout of size W=11 µm by L=6 µm, forming composite W=44 µm by L=6 µm
transistors. Each mirror transistor, both NMOS and PMOS, was formed from two
inter-digitated transistors of size W=7 µm by L=6 µm, resulting in W=14 µm by L=6 µm
transistors. Figure 5-11 shows the Gaussian distribution and histogram data collected
from 96 wide-range transamps from a single process run at a bias current of 2 µA.
Comparing these results to those in Figure 5-10a shows that increasing transistor size does
improve offset performance. Much of the improvement is attributed to matching in the
differential pair, though, since the mirror transistor sizes were only increased 16% from
the previous implementation versus 83% for the differential pair transistors.

Figure 5-11. Gaussian distribution and histogram data collected from 96 wide-range
transamps with increased device area and a bias current of 2 µA. Comparisons to
Figure 5-10a show that the increased transistor area has improved offset performance.

A third implementation of the wide-range transamp was fabricated, but this time
the transistor ‘flavors’ were reversed. In this case, the design consisted of a PMOS input
stage and biasing transistor with two NMOS current mirror loads and one additional
PMOS current mirror to transfer the negative input signal to the output. Also, the layout
of all components was changed to common-centroid in an attempt to optimize device
matching. The PMOS differential pair transistors were composed of four devices sized at
W=13 µm by L=7 µm, forming composite W=52 µm by L=7 µm transistors, and the
composite mirror transistors were W=52 µm by L=8 µm. Figure 5-12 shows the Gaussian
distribution and histogram data associated with the offset measurements from 128
devices taken from a single process run. The bias current for these measurements was
approximately 2 µA. As can be seen, the standard deviation is slightly worse than that of
the previous implementations, which is attributed to poor PMOS matching characteristics
in the differential pair [42], [60].

Figure 5-12. Gaussian distribution and histogram data associated with the third
implementation of the wide-range transamp topology. This implementation contained a
PMOS differential input stage with PMOS biasing transistor, two NMOS current mirrors,
and a final PMOS mirror. All components utilized a common-centroid layout.


The conclusions drawn from these results support those drawn from the five-transistor
implementations. Increasing transistor size, incorporating common-centroid or
inter-digitated layouts, and keeping matching transistors as close as possible on chip
all improve offsets. A note on the PMOS input wide-range transamp realization: even
though offset performance for this design was worse than the previous NMOS input
designs using inter-digitated layouts, this author believes that a transamp incorporating a
common-centroid layout and an NMOS input stage with the same transistor sizes would
provide better offset performance than the inter-digitated versions.


BiCMOS Transamps

To examine possible matching improvements associated with bipolar transistors,
several BiCMOS transconductors were designed and tested. The first topology is shown
in Figure 5-13, where vertical npn transistors available in the 2 µm Analog N-well CMOS
process [6] have been used to form the differential pair and to replace the NMOS current
mirror. Two design changes were necessitated by the bipolars' inclusion. First, due to the
low input impedance, NMOS transistors were used to form a Darlington input stage,
thereby increasing the transamp's input impedance and avoiding receptor loading. Second,
to avoid current mirror mismatches due to the bipolar base currents, an NMOS ‘helper’
transistor was added.

Figure 5-13. Wide-range BiCMOS transconductor incorporating vertical npn
transistors. The vertical npn's were available in the 2 µm Analog N-well process [6].
They were used to improve the differential pair and current mirror matching. Due to the
lower input impedance, a Darlington input stage was used. Also, to improve matching in
the mirror, an NMOS ‘helper’ transistor is used to provide the npn base currents.

The Gaussian distribution and a histogram of the offset measurements collected
from 96 devices from a single process run are shown in Figure 5-14. The bias current was
approximately 1 µA and the results show an offset standard deviation of 4.9 mV, which
was worse than expected. There are several reasons for the high offset variations. First,
while matched FETs were laid out close together, no other layout measures were taken to
reduce mismatches. Transistor sizes were W=10 µm by L=6 µm for the PMOS mirrors
and W=6 µm by L=7 µm for the NMOS Darlington transistors, and in addition, all npn's
were minimum geometry. As shown in the previous sections, increasing transistor size
and incorporating inter-digitated or common-centroid layouts should improve matching,
but these techniques were not incorporated here due to size constraints. Also, increasing
the npn emitter size should improve matching. Lastly, the added complexity and topography
of the Darlington input stage increased the offsets.

Figure 5-14. Gaussian distribution and histogram plot of offset data associated with
the BiCMOS transconductor incorporating vertical npn transistors.

A second BiCMOS transamp incorporating parasitic lateral bipolar transistors was
implemented and the schematic is shown in Figure 5-15. The NMOS differential pair is
standard, but the current mirror is formed from lateral-bipolar cascode transistors
(LBCTs). Lateral-bipolar transistor (LBT) operation is discussed in Chapter 4, but
LBCTs have been introduced to improve the LBT's poor Early voltage, VA, which not
only increases the transamp's gain but also reduces systematic mismatch by limiting Vce
changes with changing DC operating point [66]. The trade-offs associated with the
LBCTs are increased size and layout complexity along with a reduced upper
common-mode output swing.

Figure 5-15. Lateral-bipolar cascode transamp. This transamp incorporates two
lateral-bipolar cascode transistors (LBCTs) in place of a PMOS current mirror. The
bipolar transistors should improve current mirror matching while the cascoding
transistors will limit systematic mismatches by reducing Vce changes with operating
point. The cascoding transistors also provide the amplifier with higher gain.

Figure 5-16 shows the Gaussian distributions and histogram data for the
lateral-bipolar cascode transamp offsets at 1 V and 2.5 V input voltages. As before,
the standard deviation degrades slightly as the operating point increases, but the magnitude
of the variation is decreased by the cascoding transistor, which limits the Vce swing. The
NMOS input transistors were laid out using a common-centroid configuration with composite
transistor sizes of W=18 µm by L=9 µm and the bias current was approximately 5
nanoamps (nA). The LBCTs are intended solely for use in subthreshold due to the lack of a
buried layer to collect vertical currents, as depicted in Figure 4-3. This can be mitigated by
increasing the number of emitter contacts, but at the expense of increased area and layout
complexity.

Figure 5-16. Gaussian distribution and histogram of lateral-bipolar cascode transamp
offsets for input bias voltages of 1 V and 2.5 V. Offset standard deviations
are very good and variations with operating point have decreased to roughly 30 µV
versus previous designs. The operating current was approximately 5 nA.

Figure 5-17. Darlington transamp configuration used in the bipolar chip. The
Darlington input stage increases the input impedance and decreases the bias current
draw from the photoreceptors, thereby reducing loading.

The LBTs have been shown to possess excellent matching characteristics when
used as current mirrors [51], but matching for LBCTs has not been previously shown.
Therefore, the next section shows matching results from several current mirrors composed
of LBTs and LBCTs over a large range of operating currents.

Bipolar  Transamp  Realization 

Another transamp realization was implemented in a 0.8 µm double-polysilicon
bipolar process designed for wireless applications. The transamp topology is shown in
Figure 5-17, where a Darlington input stage is used to increase the input impedance and
keep the transamp from loading the photoreceptors. The Darlington stage decreases the
required input bias current by a factor of β², as shown in equation (5-12),

I_{in} \approx \frac{I_{Bias}}{\beta^{2}}        (5-12)
where β is defined as the transistor current gain and IBias is the transamp bias current. In
CMOS processes this is not necessary since FETs inherently possess high input
impedances. The NMOS biasing transistor is a high-VT device provided in the bipolar process.
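
To give a feel for the size of this reduction, the sketch below evaluates the input current with and without the Darlington stage. The current gain of 100 is a hypothetical value chosen only for illustration, the function names are not from the thesis, and the split of the bias current between the two sides of the differential pair is ignored, as it is in the statement above.

```python
def single_stage_input_current(i_bias, beta):
    """Base current drawn by a single bipolar input stage."""
    return i_bias / beta


def darlington_input_current(i_bias, beta):
    """Base current after two cascaded current gains (factor of beta squared)."""
    return i_bias / beta ** 2


I_BIAS = 1e-6  # 1 uA transamp bias current
BETA = 100     # hypothetical current gain, for illustration only

print(f"single stage: {single_stage_input_current(I_BIAS, BETA):.1e} A")
print(f"Darlington:   {darlington_input_current(I_BIAS, BETA):.1e} A")
```

With these assumed numbers the current drawn from a photoreceptor falls from the nanoamp range to roughly one hundred picoamps.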

The area consumption of each npn was 4.8 µm by 8.8 µm and the area consumption
for each pnp was 17.2 µm by 17.4 µm, resulting in a total transistor area consumption
of 767.4 µm². Data was collected from 125 devices from five chips implemented in
a single process run. The random offsets were measured at a bias current of 1 µA and at
an input voltage of 2.5 V, resulting in a random offset of 1.994 mV to one standard
deviation as shown in Figure 5-18. The offset was slightly higher than expected, but the
additional offset is attributed to the Darlington input stage. Since these devices were tested on
a single process run, no optimization was performed to try to reduce offsets as was done
with the CMOS designs. All devices were near minimum geometry, so these results
should be near worst case for this topology.

To summarize the results of this section, a listing of key parameters for each circuit
topology has been gathered into Table 5-1. Not included in this summary is the type of
layout used, due to the variations and inter-mixing of simple, interdigitated, and
common-centroid layouts within a single realization. Also not included in the area
consumption is the biasing transistor size. Even though bias transistor size is not listed, it is
important to minimize bias current variations from transamp to transamp in order to limit
DC bias point variations.

Figure 5-18. Random offset distribution and histogram for a Darlington input transamp
implemented in a bipolar process.

Table 5-1: Transamp Offset Voltage Summary.

Circuit Type                       Process Technology   Transistor    Bias      Standard Deviation
                                                        Area (µm²)    Current   at 2.5 V Input
5 Trans, NMOS Input                2 µm Analog CMOS     1688          28 nA     1.62 mV
5 Trans, NMOS Input                2 µm Analog CMOS     960           15 nA     1.96 mV
5 Trans, NMOS Input                2 µm Analog CMOS     720           5 nA      2.6 mV
5 Trans, PMOS Input                2 µm Analog CMOS     720           20 nA     2.6 mV
9 Trans, Wide Range, NMOS Input    2 µm Analog CMOS     720           12 nA     5.27 mV
9 Trans, Wide Range, NMOS Input    2 µm Analog CMOS     1032          2 µA      2.56 mV
9 Trans, Wide Range, PMOS Input    2 µm Analog CMOS     3224          2 µA      3.1 mV
BiCMOS with Darlington Input       2 µm Analog CMOS     4302          1 µA      4.9 mV
BiCMOS using LBCTs                 2 µm Analog CMOS     3404          5 nA      1.94 mV
Darlington Input Bipolar           0.8 µm Bipolar       767.2         1 µA      1.994 mV
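
The transistor areas listed in Table 5-1 for the CMOS five-transistor and wide-range designs can be cross-checked against the composite dimensions quoted earlier in this chapter. The sketch below is illustrative only. The device counts are an assumption made here (two input and two mirror transistors for the five-transistor designs, two input and six mirror transistors for the wide-range designs), with the bias transistor excluded as stated above.

```python
# Each entry: (label, input (W, L) in um, mirror (W, L) in um, mirror count).
# Mirror counts are assumed, not stated in the text: 2 for the five-transistor
# designs and 6 (three mirrors) for the wide-range designs.
designs = [
    ("5 Trans, NMOS input, first",   (40, 9), (44, 11), 2),
    ("5 Trans, NMOS input, second",  (32, 7), (32, 8),  2),
    ("5 Trans, NMOS input, third",   (24, 7), (24, 8),  2),
    ("5 Trans, PMOS input",          (24, 7), (24, 8),  2),
    ("Wide range, NMOS, first",      (24, 6), (12, 6),  6),
    ("Wide range, NMOS, second",     (44, 6), (14, 6),  6),
    ("Wide range, PMOS input",       (52, 7), (52, 8),  6),
]


def total_area(input_wl, mirror_wl, n_mirror, n_input=2):
    """Summed gate area in square micrometres, bias transistor excluded."""
    w_in, l_in = input_wl
    w_m, l_m = mirror_wl
    return n_input * w_in * l_in + n_mirror * w_m * l_m


areas = [total_area(inp, mir, n) for _, inp, mir, n in designs]
for (label, *_), area in zip(designs, areas):
    print(f"{label:30s} {area:5d} um^2")
```

Under these assumed device counts the computed areas agree with the corresponding table entries.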
Matching in Lateral Bipolar and Lateral Bipolar Cascode Current Mirrors

Data from the various transamps presented show that current mirror mismatch is a
crucial and often dominant factor in transamp offsets. MOSFET matching for both PMOS
and NMOS devices has been previously characterized [42], [59], [60]; however, matching
in lateral bipolar and lateral bipolar cascode devices in a CMOS process has not been
thoroughly characterized. Matching measurements from both devices are shown here to better
quantify the offset characteristics of the previous transamp measurements and of the
Lateral Bipolar Photoreceptors discussed in Chapter 4.

Current measurements have been taken from four sets of lateral bipolar and lateral
bipolar cascode current mirrors fabricated in a single process run [6]. Figure 5-19 shows
the percent drain current mismatch taken from the four mirror pairs as a function of DC
operating current. The error is below 5% only over an approximate current range of 100
nA to 8 µA. Although not as good as the lateral bipolar cascode devices, the improved
matching at low currents when compared to PMOS devices is the primary reason for the
improved matching in the Lateral Bipolar Receptors.

Figure 5-19. Lateral bipolar current mirror mismatch versus DC operating current level
(A), measured from four current mirrors fabricated in a single process run. Current levels
below 100 pA were difficult to measure with the equipment and test setup used.

Figure 5-20 shows the corresponding results from the four lateral bipolar cascode
current mirrors fabricated on the same process run. Matching below 5% now extends
from approximately 200 pA to 10 µA, improving the dynamic range by almost 3 orders of
magnitude. The improved matching at lower current levels is advantageous for subthreshold
transamp operation and the cascoding improves mismatch with changing operating
point as indicated in Figure 5-16.

Figure 5-20. Lateral bipolar cascode current mirror mismatch measured from four
mirror pairs fabricated on a single process run. Note the larger DC current range from
approximately 200 pA to 10 µA where the percent mismatch is below 5%.
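
The dynamic range comparison can be made explicit by expressing each sub-5% mismatch range in decades of current. This is a minimal sketch that reads the quoted limits as 100 nA to 8 µA for the lateral bipolar mirrors and 200 pA to 10 µA for the cascode mirrors; the function name is illustrative.

```python
import math


def decades(i_low, i_high):
    """Width of a current range in orders of magnitude."""
    return math.log10(i_high / i_low)


lbt_range = decades(100e-9, 8e-6)     # lateral bipolar mirrors
lbct_range = decades(200e-12, 10e-6)  # lateral bipolar cascode mirrors

print(f"lateral bipolar:         {lbt_range:.2f} decades")
print(f"lateral bipolar cascode: {lbct_range:.2f} decades")
print(f"improvement:             {lbct_range - lbt_range:.2f} decades")
```

The cascode range is wider by a little under three decades, in line with the "almost 3 orders of magnitude" stated above.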
Conclusions

This chapter has discussed offsets as applied to analog signal processing to quantify
how those offsets affect vision processing architectures. Offset measurements from
ten different transamp topologies have been presented along with conclusions as to where
the offsets originate within the devices. In most cases, the offsets originate primarily
in the current mirrors. Also, an explanation was provided as to why the offsets
changed with operating point. Finally, current mirror mismatch results from several
mirrors composed of lateral bipolar and lateral bipolar cascode devices were presented with
conclusions as to why these devices are advantageous when compared to PMOS devices.


CHAPTER  6 

A FLOATING-GATE  EDGE  DETECTION  SYSTEM 


Introduction 

Thus far several vision processing architectures and their components have been
presented. Characteristics of each component have been discussed with an analysis of
associated trade-offs in terms of dynamic range, signal-to-noise, matching, and size. Also,
an architectural comparison and analysis was performed from which a Non-Nearest
Neighbor Differencing technique for improving signal retention in low-contrast
environments was derived.

Based on these efforts, an edge detection architecture was implemented to quantify
low-contrast performance. The chip was implemented in a 2 µm Analog N-well process
available through MOSIS [6]. It incorporates the lessons learned in Chapters 2 through 5
for architectural design, improved matching, signal retention, size, and power
consumption. An additional step has been taken, however, in an attempt to further improve the
signal-to-offset performance by incorporating floating-gate buffers and floating-gate edge
detectors to cancel or reduce random offsets. This chapter presents the transamp and edge
detector designs along with their offset results. In addition, the overall performance of the
edge detection chip is presented in terms of minimum detectable signal values.
An Introduction to Floating-Gate Circuits

A burgeoning field of research related to on-chip signal processing is floating-gate
circuits. These circuits have been demonstrated in adaptive, real-time learning circuits for
neural networks and, due to their long-term signal retention capability, they are also used
in a number of trimming applications [67-73]. These circuits utilize Fowler-Nordheim
tunneling [74] and hot-electron injection [75] to remove electrons from and inject
electrons onto floating polysilicon gates. An advantage of these systems is that they have been
adapted for use in current silicon processing technologies. Therefore, specialized
processes are not required, thereby greatly reducing design cost and time.

For a thorough discussion of the governing principles associated with floating-gate
circuits, the reader is encouraged to refer to the references cited previously. As a brief
overview of the functional design principles, however, there are three primary components
to a floating-gate memory cell [73]: a tunneling transistor, a hot-electron injection
transistor, and a floating-gate capacitor, as shown in Figure 6-1. The tunneling transistor utilizes
Fowler-Nordheim tunneling to remove electrons from the floating capacitor node. The
injection transistor utilizes hot-electron injection to put electrons onto the floating
capacitor node. Finally, the floating-gate capacitor stores the resulting charge. Thus, by
controlling the amount of charge transferred by these mechanisms, one can control or set a
floating-node voltage for many years.

Figure 6-1. Three principal components of a floating-gate circuit. The tunneling
transistor removes electrons from the floating gate, the injection transistor puts
electrons onto the floating gate, and the floating-gate capacitor stores the charge or
information.
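
The charge bookkeeping behind these three components can be sketched as a toy model in which the floating-node voltage is simply the stored charge divided by the node capacitance. This is an illustration only and not the circuit used in this work: the class, its method names, and the 100 fF capacitance are hypothetical, and leakage, capacitive coupling, and the bias dependence of the tunneling and injection rates are all ignored.

```python
ELECTRON_CHARGE = 1.602176634e-19  # coulombs


class FloatingGate:
    """Idealized floating node: V = Q / C, with no leakage or coupling."""

    def __init__(self, capacitance_f, voltage_v=0.0):
        self.capacitance_f = capacitance_f
        self.charge_c = capacitance_f * voltage_v

    @property
    def voltage_v(self):
        return self.charge_c / self.capacitance_f

    def tunnel(self, n_electrons):
        """Remove electrons from the node, making it more positive."""
        self.charge_c += n_electrons * ELECTRON_CHARGE

    def inject(self, n_electrons):
        """Add electrons to the node, making it more negative."""
        self.charge_c -= n_electrons * ELECTRON_CHARGE


fg = FloatingGate(capacitance_f=100e-15, voltage_v=2.5)  # hypothetical 100 fF node
fg.tunnel(10_000)
print(f"after tunneling: {fg.voltage_v:.4f} V")
fg.inject(10_000)
print(f"after injection: {fg.voltage_v:.4f} V")
```

In this idealized picture, moving ten thousand electrons shifts a 100 fF node by about 16 mV, which indicates how finely a stored voltage could in principle be trimmed.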

There are a variety of ways to utilize the programmed voltage as a control
mechanism. In this research, the floating-gate memory cell is used in conjunction with a simple
two-transistor inverting amplifier to tune the gate voltage on a biasing transistor to cancel
offset voltages in the transamp buffers and edge detection circuits [73]. Tuning the bias
transistor's gate voltage adjusts the bias transistor's drain current, which is used to adjust
the transamp's output. A more detailed discussion follows in the floating-gate buffer and
floating-gate edge detector sections along with offset results taken from the buffers.

The  Floating-Gate  Architecture 

A schematic of the edge detection architecture implemented is presented in Figure
6-2, where the processing components consist of Lateral Bipolar Photoreceptors,
floating-gate voltage buffers, HRes spatial filters, a Non-Nearest Neighbor Differencing network
capable of spatial separations ranging from zero to seven, floating-gate edge detectors,
and single-phase shift registers for time-multiplexing the analog and digital information
off-chip. Not shown in Figure 6-2 are the off-chip buffering amplifiers driving the pad
frame and off-chip loads, in addition to a three-to-eight line decoder used to select the
spatial separation constant, ‘D’.

Figure 6-2. Floating-gate edge detection architecture. The architecture consists of
Lateral Bipolar Photoreceptors, floating-gate voltage buffers, HRes spatial filters, a
Non-Nearest Neighbor Differencing Network, floating-gate edge detectors, and shift
registers. Not shown are the off-chip buffering amplifiers for driving the pad frame and
off-chip loads and a three-to-eight line decoder used for controlling the spatial
separation constant.


Photoreceptors 

The photoreceptors are the same Lateral Bipolar Receptors discussed in Chapter 4
and the schematic diagram is shown in Figure 4-1. In this implementation, the total
receptor area was 30 µm × 30 µm, which is almost half that of the LBPs reported previously.
Because the test equipment used to gather the data presented in Chapter 4 was not
available, offset and photo-optic response characteristics were not taken. The lack of
this equipment also limited the ability to expose the receptors to a uniform lighting source
for offset calibration. These results will be discussed further in the next section.
Figure 6-3. Floating-gate buffer circuit used to correct buffer and photoreceptor
offsets.  The  circuit  has  two  operational  phases:  learning  and  buffering.  During 
learning  the  cascoded  seven-transistor  transamp  in  the  left  side  of  the  figure  operates 
in  an  open-loop  configuration  while  a negative  feedback  loop  adjusts  the  gate  voltage 
of  M6.  This  transistor  controls  the  drain  current  in  the  current  mirror  to  effectively 
compensate  for  offsets.  The  mode  of  operation  is  controlled  by  an  off-chip  control 
signal,  denoted  as  ‘m’,  which  is  inverted  on-chip  to  produce  complementary  control 
signals  for  the  transmission  gates. 


Floating-Gate  Buffers 

To reduce offsets, a floating-gate buffer circuit was designed to simultaneously
correct for random and systematic offsets in both the photoreceptors and buffers. Figure
6-3 shows the floating-gate buffer topology, which is composed of a cascoded
seven-transistor transamp, a feedback loop, and several transmission gates for controlling the mode
of operation. Operation is broken into two modes: learning and buffering. Learning is
meant to be a one-time, infrequent event which takes place in a controlled environment.
After learning, the transamps are ready for normal buffering operations and the learning
circuitry is turned off.
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Processing  begins  with  the  seven-transistor  cascoded  transamp.  It  is  composed  of 
a conventional  NMOS  differential  pair  but  the  current  mirror  is  a cascoded  PMOS  design. 
The  current  mirror  is  cascoded  for  two  reasons;  first  to  increase  the  gain  and  second  to 
limit  VDS  changes  in  the  mirror  transistors.  Increasing  the  gain  improves  performance  in 
the  learning  phase  and  limiting  VDS  changes  reduces  offset  mismatch  with  changing  oper- 
ating point.  The  importance  of  limiting  VDS  changes  was  discussed  in  Chapter  5. 

A gain increase is needed to overcome the inverter threshold voltage offsets. In
Figure 6-3, the inverter in question is formed by transistors M7, M8, and M15. The
inverter threshold does not necessarily correspond to the center of the transamp dynamic
range. Instead, the threshold occurs when the currents from M2 and M4 are close in
magnitude but not equal. The inverter threshold is defined as the gate voltage required to
make the currents through M7 and M8 equal. Therefore, by equating the NMOS current in
equation (6-1) to the PMOS current shown in equation (6-2), it can be shown that the
inverter threshold is represented by equation (6-3) if one assumes that the slope factor is
equal for both transistor types. This can be simplified if all device-specific parameters for
the NMOS and PMOS transistors are assumed equal, which yields a first order approximation
to the inverter threshold as shown in equation (6-4).

Rough threshold approximations can be determined from equation (6-4), but the random
threshold variations are determined by the parameters grouped under I08 and I07 in
equation (6-3) and the VDS factors in both equations. VDS15 is described by equation (6-5),
where it can be seen to be slightly dependent upon I0, which is the same for VDS7 and
VDS8. M15 has been included to limit the inverter current during learning. Since the
electron injection process occurs slowly, the transamp will move slowly through the inverter
threshold region where both transistors are in the ‘on’ state and both can source large
currents.
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Inverter  offset  effects  can  be  reduced  by  increasing  the  transamp,  or  first  stage, 
gain just as is done for offset and noise reduction in conventional analog amplifier design
[56]. This is demonstrated in equation (6-6), where the open-loop response of a generic
four-stage amplifier is presented.

$$V_{out} = (V_{in} + V_{off1})A_1 A_2 A_3 A_4 + V_{off2}A_2 A_3 A_4 + V_{off3}A_3 A_4 + V_{off4}A_4 \tag{6-6}$$

Vin represents the desired input signal to the first gain stage, the Voff terms represent the
random input offset voltages for each stage, and the ‘A’ terms represent the gain for each
stage. Correlating the components in equation (6-6) to the system in Figure 6-3, the terms
with a subscript one correspond to the gain and offset of the transamp, those with subscript
two correspond to the gain and offset of the first inverter stage, and so forth. Therefore, by
making A1 sufficiently large, the input signal and offsets from the first stage are amplified
sufficiently to render offsets in succeeding stages negligible, leaving transamp offsets
dominating noise performance. Thus effort can be concentrated on reducing transamp
offsets to improve overall performance.

Figure 6-4. Seven-transistor transamp partitioned to analyze the differential
mode gain. Assuming matched transistors, one can assume that all differential signals
are equally divided between the positive and negative input terminals. Also, if the
differential voltage is zero, the currents i1 and i3 are equal with each having magnitude
Ibias/2.
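As a rough numerical illustration of equation (6-6), the sketch below propagates a signal
through a cascade of gain stages and refers every stage offset back to the first-stage
input. The function names and the example gains and offsets are hypothetical values chosen
only for illustration; they are not measurements from this chip.

```python
def cascade_output(v_in, gains, offsets):
    """Open-loop output of a multi-stage amplifier, following equation (6-6).

    Each stage adds its own input offset and then applies its gain.
    """
    v = v_in
    for gain, v_off in zip(gains, offsets):
        v = (v + v_off) * gain
    return v


def input_referred_offset(gains, offsets):
    """Total offset referred to the first-stage input.

    The offset of each stage is divided by the gain of all preceding stages,
    so a large first-stage gain suppresses the later offsets.
    """
    total = 0.0
    preceding_gain = 1.0
    for gain, v_off in zip(gains, offsets):
        total += v_off / preceding_gain
        preceding_gain *= gain
    return total


if __name__ == "__main__":
    # Hypothetical values: a high-gain transamp followed by three inverter stages.
    gains = [1700.0, 20.0, 20.0, 20.0]
    offsets = [4e-3, 50e-3, 50e-3, 50e-3]  # volts
    total_mv = input_referred_offset(gains, offsets) * 1e3
    print(f"input-referred offset: {total_mv:.3f} mV")
```

With a large first-stage gain the later terms contribute only a small fraction of the
first-stage offset, which is the argument made in the text.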

The transamp can be broken into halves to analyze the differential-mode gain as
shown in Figure 6-4. Breaking the amplifier into two halves is possible if one assumes
that the transistors are matched and that all differential-mode signals are equally divided
between the positive and negative input terminals. This results in the currents i1 and i3
being equal with magnitude Ibias/2 under pure common-mode inputs. Thus, under
differential-mode inputs the currents are described by

it can be shown that

where Gm is the transamp transconductance and gm is the NMOS transistor
transconductance.
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To  complete  the  gain  computation,  it  is  necessary  to  determine  the  transamp  output 
resistance, which is computed as the parallel combination of the output resistances of the
M2 and M4 transistors as shown in equation (6-10).

$$R_{out} = R_{o2} \parallel R_{o4} \tag{6-10}$$

The respective resistances are shown in equation (6-11) and equation (6-12).

$$R_{o2} = r_{o2} \tag{6-11}$$

$$R_{o4} = r_{o4}\left[1 + g_{m4}(1 + \eta_4)\,r_{o10}\right] \tag{6-12}$$

In these equations gm is the transconductance, ro is the small-signal output resistance, and
η is defined in equation (6-13).

Here γ is the bulk effect parameter, φF is the Fermi level, and VSB is the source-to-bulk
voltage.

From these equations, the transamp output resistance in the subthreshold region
can be defined as

This can be simplified to

$$R_{out} = \frac{n\lambda_{10}U_t + 1}{I_{D4}\,n\lambda_{10}\lambda_4 U_t + I_{D2}\lambda_2\left[n\lambda_{10}U_t + 1 + \eta_4\right]} \tag{6-15}$$

when one assumes that the drain currents are equal and knowing ro = 1/(λ ID). Again,
λ = 1/VA is the channel length modulation parameter and VA is the Early voltage. Thus
the  amplifier  gain  is 
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Therefore,  since  gm  = ID/nUt  equation  (6-16)  can  be  reduced  to 

$$A_{TAMP} = \frac{1}{nU_t}\cdot\frac{n\lambda_6 U_t + 1}{n\lambda_6\lambda_4 U_t + \lambda_2\left[n\lambda_6 U_t + 1 + \eta_4\right]} \tag{6-17}$$

Based on previous measurements made from 6 µm × 6 µm devices fabricated in the 2
µm Analog N-well process, VA2 = 42.0 V and VA4 = 24.4 V = VA6. If one assumes
that n = 0.7 and Ut = 0.026 V, the transamp gain would be ATAMP = 1704. From equation
(6-17), it can be seen that the amplifier gain has been reduced to fundamental device
parameters which are not operating point dependent, and thus the maximum gain is achieved.

From the information contained in equation (6-3), equation (6-5), and equation
(6-17), it is clear that with these devices operating in subthreshold the only way to reduce
offset mismatches in the learning phase is to reduce the subthreshold current constant
variations contained in I07 and I08. Realistically, the only way to reduce these variations is
to increase transistor size, which would also have the added benefit of increasing the gain.
From the same process run, Early voltage measurements were collected for several NMOS
and PMOS devices sized at W = 6 µm × L = 25 µm, giving VA2 = 125.9 V and
VA6 = VA4 = 40.6 V. These values would result in ATAMP = 5108,
nearly tripling the transamp gain.
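The two gain figures quoted above can be checked by evaluating equation (6-17) directly.
The sketch below is a minimal calculation under stated assumptions: the bulk-effect factor
η4 is not given in this section, so the default of 0.353 is an assumed value, chosen
because it approximately reproduces both quoted gains. The function and parameter names
are illustrative only.

```python
def transamp_gain(va2, va4, va6, n=0.7, ut=0.026, eta4=0.353):
    """Evaluate equation (6-17) from Early voltages given in volts.

    lambda_k = 1 / VA_k is the channel length modulation parameter.
    eta4 is NOT stated in the text; 0.353 is an assumed value.
    """
    lam2, lam4, lam6 = 1.0 / va2, 1.0 / va4, 1.0 / va6
    numerator = n * lam6 * ut + 1.0
    denominator = n * lam6 * lam4 * ut + lam2 * (n * lam6 * ut + 1.0 + eta4)
    return numerator / (n * ut * denominator)


if __name__ == "__main__":
    small = transamp_gain(va2=42.0, va4=24.4, va6=24.4)    # 6 um x 6 um devices
    large = transamp_gain(va2=125.9, va4=40.6, va6=40.6)   # W = 6 um, L = 25 um devices
    print(f"gain, short devices: {small:.0f}")
    print(f"gain, long devices:  {large:.0f}")
    print(f"ratio: {large / small:.2f}")
```

Because the gain depends only on n, Ut, η4, and the Early voltages, the bias current does
not appear, which matches the observation that the result is not operating point dependent.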

Floating-gate  buffer  learning 

To fully realize the chip’s potential, the floating-gate buffer must first be ‘trained’
to cancel offsets. Learning begins by ‘erasing’, or removing, stray charges left on the
floating-gate node during fabrication using Fowler-Nordheim tunneling. This is achieved by
applying a large voltage, 35 to 40 volts, to the drain and source regions of the tunneling
transistor, M25, for several minutes [73]. During this phase, the injection transistor supply
voltage, VDD2, is turned off to avoid stray injection. After a sufficient time, the tunneling
voltage is removed and this line is grounded. Tunneling removes electrons residing on the
floating-gate node, thereby raising the floating node voltage towards the positive supply
voltage. This forces the output of the amplifier formed by transistors M22 and M23 low,
consequently setting the gate voltage of the learning transistor, M6, low. At this point, all
current drawn by M1 flows through M6, leaving no current flowing through the current
mirror. Therefore the transamp output is drawn low by currents flowing through M2.

Offset calibration, or learning, proceeds by putting the buffer into the learning mode
by properly setting the m and m̄ control lines and turning on VDD2. This enables the
feedback loop controlling the injection transistor’s drain voltage. Next a DC bias level, Vref,
chosen based on the circuit’s input common-mode range, is introduced onto the transamp’s
negative input terminal. The receptor inputs are then introduced onto the transamp
positive input terminal while the receptors are illuminated with a constant, uniform light
source. Artificial bias inputs can also be introduced instead of the receptor inputs for
testing or calibration.

This ‘logic level’ output from the transamp is used to set the third inverter stage
output high, thereby applying VDD2 to the injection transistor. VDD2 is separated from the
other supply voltages so the drain voltage on M24 can be controlled to avoid damaging the
injection transistor. VDD2 can also be used to control the injection current level. While in
learning, electrons are injected onto the floating-gate node by hot-electron injection [75],
thus slowly reducing the floating node voltage. As the floating-gate voltage decreases, the
output voltage from the M22-M23 amplifier increases, thereby decreasing the current
supplied by M6. This forces the residual current to flow through the current mirror. This
process continues until the transamp output voltage surpasses the first stage inverter’s
threshold voltage, subsequently drawing the output from the third inverter stage to ground,
which stops the electron injection process and completes the learning phase.
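The self-terminating nature of the learning phase can be sketched with a simple behavioral
loop. This is only a toy model under stated assumptions: the linear transamp
characteristic, its slope, the injection step size, and all voltages below are hypothetical
and are not taken from the fabricated circuit. It shows only that injection stops once the
transamp output crosses the inverter threshold, regardless of the individual cell offset.

```python
def toy_transamp(offset_v, slope=200.0, v_mid=3.0, v_dd=5.0):
    """Hypothetical transamp: output rises as the floating-gate voltage falls."""
    def output(v_fg):
        return min(v_dd, max(0.0, slope * (v_mid + offset_v - v_fg)))
    return output


def learn(transamp_output, v_inverter_threshold, v_fg_erased,
          dv_inject=1e-4, max_steps=1_000_000):
    """Lower the floating-gate voltage until the inverter chain trips.

    Each loop iteration stands in for a small amount of hot-electron injection.
    Returns the final floating-gate voltage and the number of injection steps.
    """
    v_fg = v_fg_erased
    for step in range(max_steps):
        if transamp_output(v_fg) >= v_inverter_threshold:
            return v_fg, step  # inverter output goes low, injection stops
        v_fg -= dv_inject
    raise RuntimeError("learning did not converge")


if __name__ == "__main__":
    for offset in (-5e-3, 0.0, 5e-3):  # hypothetical per-cell offsets in volts
        amp = toy_transamp(offset)
        v_fg, steps = learn(amp, v_inverter_threshold=2.5, v_fg_erased=4.5)
        print(f"offset {offset * 1e3:+.0f} mV: v_fg = {v_fg:.4f} V after {steps} steps")
```

In this model each cell ends with a different floating-gate voltage but nearly the same
output level, which is the behavior the offset correction relies on.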
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Figure  6-5.  Floating-gate  buffer  array  output  after  learning.  Trace  1 shows  the  array 
sync  pulses  while  trace  two  shows  the  buffer  outputs.  The  input  signal  for  all  buffers 
was  a 2.5V  bias  input  voltage. 


The  result  is  the  transamp  output  is  now  set  to  approximately 
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This  allows  the  transamp  output  DC  operating  point  to  be  set  independently  from  the 
receptor input, thereby tailoring the operation to best suit the input requirements of the
subsequent processing circuitry. Also, the Vref signal is required since a known, stable
signal is needed to drive both the receptor and buffer outputs to a known voltage. This is
the only way to cancel both the receptor and buffer offsets simultaneously in this design.
Figure 6-5 shows the output from a one-dimensional floating-gate buffer array after the
learning operation. The input to the non-inverting transamp input terminal for all buffers
was a 2.5 V bias input signal so the buffer offsets could be measured after learning. Trace 1
shows the Sync line indicating the beginning and end of the array while Trace 2 shows the
transamp outputs. Again one can see the clocking noise at the beginning and middle of
each receptor output due to noise from the single-phase shift register.

Figure 6-6. Floating-gate buffer offset voltage distribution and histogram. The offset
voltage to one standard deviation is 4.05 mV, which is much higher than in the
implementations reported in Chapter 5.

Figure 6-6 shows the Gaussian distribution and histogram of offset voltages
associated with the floating-gate transamp with the bias current set to 5 nA. The offset
voltage to one standard deviation was 4.05 mV, which is much higher than the levels
reported for many implementations in Chapter 5. There are several points of interest
associated with these offset results. First, as discussed in the previous section, much of this
deviation is due to inverter threshold variations and finite gain of the transamp. Second, the
layout used for the transamp was single W = 6 µm × L = 6 µm transistors for the devices
M1, M2, M3, M4, M5, M6, M9, and M10 shown in Figure 6-3. All other devices were
minimum or near-minimum geometry. Although the offsets before correction could not be
characterized, this implementation should produce near worst case variations in transamp
random offset voltages relative to the measurements shown in Chapter 5 due to the simple
layout. One should also note that in performing the offset correction, this floating-gate
design corrects not only for random offset voltages but also for systematic offset voltages.

Figure 6-7. Floating-gate buffer array with photoreceptor inputs. In this figure, the
photoreceptor outputs have been superimposed on the trained buffer positive input
terminals, i.e. the buffers have not been trained with the receptor outputs yet.

To characterize the buffer operation in conjunction with the receptors, Figure 6-7
shows the array output when the receptor inputs have been superimposed on the learned
buffer outputs. The receptor offsets have therefore not been learned; the buffers have only
been trained to cancel buffer offsets. The receptors have been illuminated such that their
outputs are approximately in the middle of the operating range. The exact illumination
intensity could not be determined since the equipment used for characterization in Chapter
4 was no longer available. Based on the photo-optic measurements in Chapter 4,
the approximate intensity was 0.5 foot-lamberts.

Figure 6-8. Gaussian distribution and histogram for offsets associated with the receptors
and floating-gate buffers. The offsets to one standard deviation are 4.09 mV.

Receptor  and  Floating-Gate  Buffer  Offsets 


Offset measurements associated with Figure 6-7 are shown in Figure 6-8. The
offsets to one standard deviation are 4.09 mV. There are several interesting features in
Figure 6-8. First, the offsets are not significantly higher than those shown in Figure 6-6,
implying that offsets in the receptors are low in comparison to offsets in the buffers.
Chapter 4 showed that these Lateral Bipolar Receptors possess low random offset
variations. Second, in these measurements the randomly distributed receptor and buffer
offsets nearly cancel one another, as can be seen by examining the change in signal levels
from Figure 6-5 to Figure 6-7.
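The one-standard-deviation offset figures quoted for Figures 6-6 and 6-8 are statistics
over the sampled array outputs. The sketch below shows one way such a figure could be
computed; it is not the author's measurement procedure. It assumes that offsets are taken
relative to the array mean and that the sample standard deviation is used. Neither choice
is stated in the text, and the function name and bin width are illustrative.

```python
import statistics
from collections import Counter


def offset_statistics(outputs_v, bin_mv=1.0):
    """Return (offsets in mV, one-sigma offset in mV, histogram of binned offsets).

    Offsets are measured relative to the array mean, so a DC level shared by
    every cell is not counted as offset.
    """
    mean_v = statistics.fmean(outputs_v)
    offsets_mv = [(v - mean_v) * 1e3 for v in outputs_v]
    sigma_mv = statistics.stdev(offsets_mv)  # sample standard deviation
    histogram = Counter(round(o / bin_mv) * bin_mv for o in offsets_mv)
    return offsets_mv, sigma_mv, histogram


if __name__ == "__main__":
    # Hypothetical sampled buffer outputs in volts, for illustration only.
    samples = [2.500, 2.504, 2.496, 2.501, 2.499]
    _, sigma, hist = offset_statistics(samples)
    print(f"one-sigma offset: {sigma:.2f} mV")
    print(sorted(hist.items()))
```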

Figure 6-9 shows the floating-gate buffer outputs after the receptor inputs have
been applied and the buffers trained. The receptors were illuminated by placing an
incandescent light source in the same room as the chip and exposing the chip to light
without a lens on top so that only diffuse lighting is incident on the receptors. As can be
seen from the Trace 2 information at the bottom of the figure, the random offset variations
have increased significantly, since Trace 2 had to be increased to 10 mV/division to get the
entire picture on the oscilloscope screen. This is an increase from 5 mV/division in the
previous figures. The author believes the reason for the increased offset variations is that
the lighting, although diffuse, was not constant and uniform across all receptors. Any
variation in intensity from one receptor to the rest would cause a variation of this nature.
Unfortunately, as mentioned previously, the equipment used to characterize the receptors
in Chapter 4 was no longer available, and therefore the receptors could not be
exposed to a high quality, uniform, calibrated light source. The author also believes that
with the proper calibration scheme, these receptor offsets can be reduced to the levels
reported for the buffers only. Due to the nature of the offsets, and since the data would not
actually characterize receptor or buffer offsets, measurements characterizing the offset
standard deviation were not taken.

Figure 6-9. Floating-gate transamp outputs with receptor inputs after learning. The
receptor outputs with uniform illumination have been applied to the floating-gate
buffers and the buffers were trained. The offset levels have increased, as can be seen
from the information at the bottom of the figure, since the Trace 2 settings have
increased to 10 mV/division.
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Floating-gate  buffering  operation 

Once the learning operation is complete, the m and m̄ control signals are reversed
to put the transamp into a follower configuration, thereby enabling normal buffering
operations. The transamp then acts as a voltage follower, driving the receptor outputs into
the HRes filtering circuitry. During this phase, the VDD2 supply is grounded along with the
tunneling line to keep any stray charge transfers from occurring. In the implementation
shown in Figure 6-3, VDD1 is connected to both the first and second inverter stages. In
future implementations it is desirable to make this another special supply which can be
grounded during normal operations or when the buffering mode is entered. Power
consumption of this transamp under buffering operations is approximately 50 nW with a
bias current of 5 nA, including power consumption of the floating-gate amplifier.
Therefore total power consumption for the array of 26 devices implemented on chip was
1.3 µW.

HRes Spatial Filters

The floating-gate buffers drive a one-dimensional array of HRes circuits. A
thorough discussion of these circuits was presented in Chapter 2 and therefore will not be
repeated here. In brief, these circuits act as spatial filters, sharing input voltage signals
among neighboring pixels. The filtering is controlled by an off-chip bias voltage which
controls the HRes bias current.

Non-Nearest  Neighbor  Differencing  Network 

The NNND network implemented on this chip is exactly like the one presented in
Chapter 3. The spatial separation was selectable from D=0 to D=7, allowing the
floating-gate edge detectors to train to their own offsets with D=0 while remaining capable
of computing differences with receptors seven pixels away. The separation constant is
chosen through a three-to-eight line decoder integrated on-chip to reduce the number of
input/output pins. Since this differencing network was passive in nature, it did not consume
any power other than parasitic losses, which are neglected in the power consumption
calculations. For an in-depth discussion of the NNND the reader is encouraged to refer to
Chapter 3.
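The selection and differencing described above can be sketched as follows. This is an
illustrative model only: the on-chip network is a passive switch network detailed in
Chapter 3, and the sign convention, the handling of pixels near the array ends, and the
function names used here are assumptions.

```python
def decode_3_to_8(select):
    """One-hot 3-to-8 decoder: three select bits enable one of eight lines."""
    if not 0 <= select <= 7:
        raise ValueError("select must fit in three bits (0 to 7)")
    return [1 if line == select else 0 for line in range(8)]


def nnnd(values, d):
    """Difference each pixel with the pixel d positions away.

    With d = 0 every detector sees the same signal on both inputs, so all
    differences are zero; this is the setting used while training.
    """
    if not 0 <= d <= 7:
        raise ValueError("separation must be between 0 and 7")
    return [values[i + d] - values[i] for i in range(len(values) - d)]


if __name__ == "__main__":
    pixels = [1, 2, 4, 7, 11]
    for d in (0, 1, 2):
        print(d, decode_3_to_8(d), nnnd(pixels, d))
```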

Floating-Gate  Edge  Detectors 

The next element in the processing array is the floating-gate edge detectors. Just
as with the buffers, the edge detection elements have been designed with floating-gate
amplifiers to sample and cancel their offsets for improved edge detection capabilities. The
absolute-value edge detection element shown in Figure 5-2b has been redesigned to
incorporate two floating-gate elements as shown in Figure 6-10. In Figure 6-10, transistors
M1 through M13 form the modified wide-range transamp performing the absolute-value
edge detection function. Transistors M14 through M29 form one floating-gate amplifier
while transistors M30 through M45 form the second. Lastly, transistors M46 through M51
perform an analog-to-digital conversion on the transamp outputs.

Recall that this wide-range transamp design was modified by changing the current
mirror normally formed by transistors M11 and M12, placing a bias threshold voltage,
VThres, at their gates instead to set the edge detection threshold. To the right of the
edge detector are two separate floating-gate amplifier circuits, one for each side of the
transamp. Two floating-gate circuits are necessary since the two sides of the transamp act
independently from one another. VThres essentially separates the two sides, turning the
amplifier into a differential absolute-value circuit. Separating the sides means that any
type of single-sided offset correction would likely not capture the offsets associated with
transistors M11 and M12, thereby forcing a double-sided offset correction mechanism.

Figure 6-10. Floating-gate edge detector topology incorporating a modified wide-range
transamp and dual floating-gate amplifier circuits for offset correction. The dual
floating-gate amplifiers are needed for offset correction in both sides of the transamp
since they operate independently. Finally, transistors M46 through M51 perform an
analog-to-digital conversion for off-chip processing.

Figure 6-11. Modified wide-range transamp used as absolute-value edge detection
circuit. Assuming all devices are matched and that all differential input signals are
equally divided between the inputs, the circuit can be divided into equal halves for
computing the differential-mode gain.

As with the floating-gate buffers discussed previously, gain in the wide-range
absolute-value circuit needs to be high to overcome offsets in the floating-gate amplifier
inverter stages. An analysis of the transamp gain can be performed just as with the buffer
circuits. First, if one assumes that all devices are matched, then the circuit can be broken
as shown in Figure 6-11. Thus for pure common-mode inputs

$$i_1 = i_3 = i_2 = i_4 \tag{6-19}$$

and

$$i_{Thres1} = i_{Thres2} \tag{6-20}$$

Therefore the following relationships can be written:

$$i_{out1} = i_1 - i_{Thres1} \tag{6-21}$$

and

$$i_{out2} = i_4 - i_{Thres2} \tag{6-22}$$

To compute the gain, we first determine the transamp transconductance, Gm, as

$$i_1 = g_{m1}\frac{v_d}{2} = i_3 \tag{6-23}$$

and thus

$$i_{out1} = g_{m1}\frac{v_d}{2} - i_{Thres1} \tag{6-24}$$

where

$$i_{Thres1} = I_0\,e^{V_{Thres}/(nU_t)} \tag{6-25}$$

or simply the subthreshold current equation set by VThres. Since the threshold current is a
constant, Gm can be determined from equation (6-24) to be

$$G_m = \frac{i_{out1}}{v_d} = \frac{g_{m1}}{2} \tag{6-26}$$

Next the output resistance can be computed by the parallel combination of
resistances

$$R_{out} = R_{o7} \parallel R_{o11} \tag{6-27}$$

where

$$R_{o7} = r_{o7}\left[1 + g_{m7}(1 + \eta_7)\,r_{o9}\right] \tag{6-28}$$

and

$$R_{o11} = r_{o11} \tag{6-29}$$

Therefore the output resistance can be shown to be

$$R_{out} = \frac{n\lambda_9 U_t + 1}{I_{D7}\,n\lambda_9\lambda_7 U_t + I_{D11}\lambda_{11}\left[n\lambda_9 U_t + 1 + \eta_7\right]} \tag{6-30}$$

yielding a simplified, single-sided transamp gain of

$$A_{EAMP} = \frac{1}{2nU_t}\cdot\frac{n\lambda_9 U_t + 1}{n\lambda_9\lambda_7 U_t + \lambda_{11}\left[n\lambda_9 U_t + 1 + \eta_7\right]} \tag{6-31}$$

Operation in the edge detection circuits is much the same as in the floating-gate
buffers. First, all floating-gate circuits are erased by applying a 35 V to 40 V tunneling
voltage. Next, the transmission gates are put into learning mode by properly setting the m
and m̄ lines and turning on the VDD2 power supply. In this case, the same potential is
applied to both input terminals during learning, versus the two different potentials used in
the buffers. Therefore, assuming matched devices, the currents flowing through both
halves of the transamp are equal, and when erased all mirror currents are flowing through
M29 and M45. Thus the threshold transistors draw the output voltages to ground since no
current is flowing through the current mirrors. This sets the edge detectors into learning
mode; hot-electron injection then proceeds in the floating-gate amplifier until each half of
the edge detector surpasses the inverter’s threshold voltage. Offsets for both halves of the
circuit are thus learned independently.

At this point the VDD2 power supplies are turned off and the m and m̄ signal lines
are reversed so that the amplifier can enter normal operation. The circuits are now ready
to perform edge detection functions. Power consumption is slightly dependent upon the
VThres setting, but assuming all branch currents to be 10 nA the total detector consumption
would be 300 nW including the floating-gate amplifiers. Therefore, the twenty-six
circuits implemented on-chip would consume a total of 7.8 µW.
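The array totals follow directly from the per-cell figures. The short sketch below repeats
that arithmetic; the per-cell powers and the array length of 26 are the values quoted in
the text, while the function and variable names are illustrative.

```python
CELLS = 26  # number of buffers and of edge detectors implemented on chip


def array_power_w(per_cell_w, cells=CELLS):
    """Total power of an array of identical cells, in watts."""
    return per_cell_w * cells


if __name__ == "__main__":
    buffers_w = array_power_w(50e-9)     # 50 nW per floating-gate buffer
    detectors_w = array_power_w(300e-9)  # 300 nW per floating-gate edge detector
    print(f"buffers:   {buffers_w * 1e6:.1f} uW")
    print(f"detectors: {detectors_w * 1e6:.1f} uW")
    print(f"combined:  {(buffers_w + detectors_w) * 1e6:.1f} uW")
```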

Shift  Registers  and  Support  Circuitry 

The remaining components in the processing array consist of shift registers to
time-multiplex the analog information onto a single output line, off-chip buffers, and a
three-to-eight line decoder. The shift registers are single-phase devices which enable a set
of switches in each cell to output the sync pulse, the output from the HRes circuits, and the
output from the edge detectors. These signals drive the positive input terminal of a
voltage follower used to drive the pad frame and off-chip loads. Lastly, a three-to-eight line
decoder was implemented to select the sampling constant, ‘D’.

Figure 6-12. [...] filtering was applied for this measurement and nearest neighbor
differencing was used for computing the edge location. Trace 1 corresponds to the imager
sync line, Trace 2 corresponds to the HRes output, and Trace 3 corresponds to the edge
detector outputs.


Detector  Performance 

With all the components described, let us now look at the array performance as a
whole. The results shown in this section are captured oscilloscope plots from the
functioning array under simulated inputs. Simulated inputs are used since this is the only
way to quantitatively characterize the detector performance. To begin, Figure 6-12 shows
the floating-gate array output to a 10 mV step input signal after training both the buffer and
edge detector circuits. Trace 1 corresponds to the imager sync line, Trace 2 corresponds
to the HRes output, and Trace 3 corresponds to the edge detector outputs. This
combination of signal lines will be used throughout the figures shown in this section. In the
figure, no filtering has been applied, so nearest-neighbor differencing was used for
computing the edge location. The 10 mV step was the minimum signal detectable by the
array in this configuration.

There are several factors governing the array performance shown in Figure 6-12.
First, the random distribution of offset voltages in the receptors, buffers, and edge detectors
can combine such that edge signals detected in one direction must be larger than signals
detected in the opposite direction. In this case, the receptors are not used, but the buffers at
the middle of the array have a signal difference of several millivolts which combines to
negatively affect edge detection performance. This is more clearly seen in Figure 6-13,
where the edge signal direction has been reversed and a 3 mV edge signal can be detected.
Again, since there is no filtering, nearest-neighbor differencing is employed to compute edge
locations. Here, the random offsets in the buffers and edge detectors combine with the
input signal to produce a signal large enough to exceed the detector threshold.

Figure 6-13. Floating-gate array output to 3 mV step edge input signal. The edge signal
direction has been reversed from that shown in Figure 6-12 and a small edge signal can
be detected. This is due to the random variation in offsets in the buffers and edge
detectors.

The previous two figures demonstrate system performance without employing
filtering or Non-Nearest Neighbor Differencing. Recall that NNND does not improve the
performance when uncorrelated system offsets are present. Only when filtering is applied
and the signal magnitudes are shared does NNND render benefit. Figure 6-14 shows that
the minimum detectable step edge input signal is 7 mV when the filtering constant is
increased to L = 5 and nearest neighbor differencing is used. When filtering is applied, the
edge detector thresholds are readjusted to reflect the decrease in buffer offsets, allowing
smaller edges to be detected. It is clear the filtering has decreased the signal variations
caused by the transamp offsets, but as noted in Chapter 3 the filtering has also spread out
the input signal.

Figure 6-15 shows the edge array output when the edge signal input is again
reversed while keeping L = 5. In this case, however, D has been increased to 2, yielding a
minimum detectable signal of 7 mV. This demonstrates that the array has the same
non-uniform response to the edge input direction even when filtering is applied, since
Figure 6-12 shows that a 10 mV edge is the minimum detectable signal when no filtering is
applied and D=1. Again, this is a characteristic of the random offset distribution along the
imaging array.

To examine the minimum detectable signal response of the detector array, the spatial
separation constant was systematically increased and the minimum detectable input
signal level examined at each step. The filtering constant was kept at L = 5 for
all measurements. Figure 6-16 shows the first of these measurements, where D=3 yields
a minimum detectable signal level of 4 mV. As noted in Chapter 3, when the spatial
separation constant increases, the detected edge location can move away from the actual edge
location in the array. Again, this is due to the random offset variations, the amount of
signal applied to the edge detector, the spatial sampling constant, and the direction of the
input signal.

Figure 6-17 shows how the random offsets can again combine to negatively affect
edge detection performance. In the figure, D=4 but the minimum detectable signal level is
5 mV. Essentially, the offsets from the buffer and edge detector circuits have combined to
decrease the effective signal magnitude, thereby increasing the input signal
level required for detection. The performance degradation is not severe, however, because the
filters reduce the buffer variations.

Figure 6-15. Edge architecture output while reversing the input signal, keeping
L = 5, and increasing the spatial separation to D=2 yields a minimum detectable
signal of 7 mV. This same configuration previously required a 10 mV signal in
Figure 6-12 when no filtering was applied and D=1.

The characterization process continues in Figure 6-18, where D=5 and the
minimum detectable input signal has been reduced to 3 mV. One can also note that a second
detector is beginning to respond just to the left of the detected edge signal; again, this is an
artifact of the offset distributions combining with the input signal. One final array
characterization is shown in Figure 6-19, where D=7 and the minimum detectable input signal is
3 mV. A plot showing D=6 was skipped since the output was the same as in Figure 6-18
and Figure 6-19 at 3 mV.

Figure 6-16. Floating-gate edge detector array output showing a minimum detectable
signal of 4 mV when L = 5 and D=3.


Detecting edges below 3 mV will be difficult in the current realization since the
input signal levels are beginning to reach the offset distribution levels. To reliably
improve performance further, offsets must be reduced in all processing stages. To detect
smaller edges without reducing the component offsets in the current realization, the
filtering must be increased to effectively reduce signal variations in the buffers and receptors.
In doing so, however, the input signal will also be spread out among the
neighboring pixels, so the spatial separation constant must increase to retain the signal
magnitude required to exceed the detector thresholds. This will also reduce the
resolution achievable by the array.
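
The trade-off between filtering and spatial separation can be sketched numerically. The following is an illustration under stated assumptions, not a model of the fabricated HRes network: the smoothing kernel is assumed to be a normalized two-sided exponential with space constant L, and the function names are hypothetical. It estimates what fraction of a step edge survives as a difference between two pixels D apart.

```python
import math


def smooth(signal, space_constant):
    """Smooth with a normalized two-sided exponential kernel exp(-|k| / L) (assumed form)."""
    n = len(signal)
    out = []
    for i in range(n):
        weights = [math.exp(-abs(i - j) / space_constant) for j in range(n)]
        total = sum(weights)
        out.append(sum(w * s for w, s in zip(weights, signal)) / total)
    return out


def peak_difference(signal, d):
    """Largest magnitude difference between any two pixels d apart."""
    return max(abs(signal[i + d] - signal[i]) for i in range(len(signal) - d))


STEP_MV = 10.0
step = [0.0] * 32 + [STEP_MV] * 32          # ideal step edge in the middle of the array
filtered = smooth(step, space_constant=5.0)  # L = 5, as in the measurements above

# Fraction of the original step available to a detector for each spatial separation D.
retained = {d: peak_difference(filtered, d) / STEP_MV for d in (1, 3, 5, 7)}
for d, fraction in retained.items():
    print(f"D={d}: {fraction:.2f} of the step reaches the detector")
```

Under this assumed kernel, nearest-neighbor differencing (D=1) sees only about a tenth of the step, while D=7 recovers roughly half of it, at the cost of a wider span over which the edge could lie.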


To note the performance when the input signal is reversed, Figure 6-20 shows the
detector output when the edge signal is reversed and D=7. The minimum detectable signal
level was 7 mV, demonstrating the non-uniform performance. Again in this figure, a
second detector can be seen beginning to transition to the right of the detected signal.
Another feature is the drop-out occurring in the unused detector outputs. Since one input
pin floats on these detectors, the outputs are arbitrary.

Figure 6-17. Detector output when D=4 showing a minimum detectable signal of 5
mV. The increased input signal magnitude is due to the buffer and detector offsets
combining to reduce the effective signal being processed.


Conclusions 

This chapter introduced a floating-gate amplifier circuit which can be used for
offset correction in voltage buffers and comparators. This floating-gate circuit was
incorporated into both the buffer and edge detector circuits in an attempt to reduce the effects of
offsets. As was detailed, the offsets in the buffer circuits were higher than the transamp
offsets presented in Chapter 5. The performance degradation is due mainly to the offsets
associated with the first inverter stage used to control the offset cancellation circuitry.
One way to overcome this is to have a single training circuit which is used for all the buffer
circuits. That way, any offset introduced by the feedback control circuitry would be
introduced into all buffer circuits, effectively adding a small amount of systematic offset
to them. As discussed in Chapter 2, systematic offset is canceled in the
processing circuitry because a point-to-point computation is being performed on the HRes
outputs.

... signal of 3 mV. One can note that another detector is beginning to respond just to the
left of the detected input signal.
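
The cancellation of a shared offset by the point-to-point computation can be written out in one line. The notation is introduced here for illustration and is not the thesis's own: V_i is the ideal HRes output at pixel i, V_sys is the offset common to every buffer trained by the same circuit, epsilon_i is the remaining random offset of buffer i, and D is the spatial separation.

```latex
\[
\bigl(V_{i+D} + V_{\mathrm{sys}} + \epsilon_{i+D}\bigr)
- \bigl(V_{i} + V_{\mathrm{sys}} + \epsilon_{i}\bigr)
= \bigl(V_{i+D} - V_{i}\bigr) + \bigl(\epsilon_{i+D} - \epsilon_{i}\bigr)
\]
```

The common term V_sys drops out of the difference, so only the random components of the buffer offsets reach the edge detector.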

Since the buffer offsets have increased, one may ask why the edge detection
capability of this system improved from approximately 9 mV in Chapter 3 to 3 mV here. The
answer comes from reducing offsets in the edge detection circuits. It is difficult to
measure the edge detector offsets since they are operated as comparators in an open-loop
mode. However, from previous measurements the author was able to determine that after
reducing the transamp offsets to approximately 2 mV to one standard deviation, the
dominant factor in limiting edge detection performance became offsets in the comparator
circuits. This is also demonstrated by the inability to reduce the minimum detectable signal
by increasing the spatial sampling constant, which implies that the array is operating at the
edge of the detector circuit offsets. So by applying the floating-gate amplifiers to the edge
detection circuits, the offsets have been reduced enough to improve the overall detector
array performance. A brief analysis has shown how offsets in the inverter stages affect
offset cancellation performance.

Figure 6-19. Detector array output with D=7, the maximum attainable in the current
realization, showing a minimum detectable input signal of 3 mV.

Figure 6-20. Detector output when input signal is reversed and D=7 showing a
minimum detectable input signal of 7 mV.

Lastly, the author believes that detector array performance can be further improved
by changing the feedback circuit used in the buffer and edge detector circuits. One
solution is to incorporate a single on-chip feedback circuit which is used to sequentially train each
buffer and edge detector circuit. Alternatively, the control signals can be fed on- and
off-chip to external circuitry which is used to train the buffers and comparators. Incorporating
the off-chip learning circuitry would also greatly decrease the area consumption of these
circuits. Since a 1 pF capacitor is used to store the training information, these devices take
up a significant amount of silicon. It would also allow the off-chip circuitry to be custom
tuned to cancel any unwanted offsets.


CHAPTER  7 
CONCLUSIONS 


Processing signals using subthreshold analog circuits presents a circuit designer
with a multitude of powerful tools but at the same time also presents a new set of design
constraints. This thesis has addressed processing architectures for computing edge
detection from images focused on a silicon chip. Chapter 1 presented results from numerous
designs previously implemented for motion detection, edge detection, or other more
application-specific designs. All of these circuits incorporated analog signal processing on a
single silicon chip to accomplish a given task. Each discussion highlighted how
offsets in the computational circuits fundamentally limited performance.

Chapter 2 then presented a comparison of three different architectural designs for
computing low-contrast edge detection. From this discussion, it was shown that a
Differenced Gaussian architecture fundamentally possesses a superior signal retention capability
for enhancing low-contrast edge detection. To quantify the architecture’s performance, a
Differenced Gaussian realization in a 2 µm Analog N-well process was shown to be
capable of detecting edges as low as 22 mV. Chapter 3 improved this architecture’s edge
detection capability by presenting a Non-Nearest Neighbor Differencing technique which
can be used in silicon systems to improve low-contrast edge detection at the expense of
resolution. It also presented results from a chip incorporating the Differenced Gaussian
architecture in conjunction with a Non-Nearest Neighbor Differencing network which
demonstrated minimum detectable signal levels as low as 9 mV.

Chapter 4 presented performance characteristics from several on-chip
photoreceptors in both CMOS and bipolar processes. A Lateral Bipolar Photoreceptor was presented
which incorporates lateral bipolar transistors in a CMOS technology to reduce receptor
offsets. Offset and photo-optic response measurements were presented detailing the
superior receptor performance. It was shown that the Lateral Bipolar Receptor possesses
random offset variations below 2 mV to one standard deviation while possessing a dynamic
range exceeding 7 orders of magnitude. In addition, performance results from two
receptors implemented in a modern 0.8 µm double-polysilicon bipolar process were presented.
The receptor results demonstrate the performance characteristics which can be expected
from more modern, high-speed processes.

Chapter 5 addressed performance characteristics of the next architectural
processing elements, the transamps used to buffer receptor outputs into the HRes
circuits. Offset results from ten different transamp realizations were presented. It was shown
that transamps possessing random offset variations below 2 mV to one standard deviation
can be realized while consuming small amounts of area and power. A logical conclusion
drawn from the results is that simpler designs possess superior offset characteristics.
Beyond this, the chapter presented a quantitative comparison and analysis of the
offset improvements which can be expected from implementing the transamps using
various transistor sizes, layout configurations, and circuit topologies. Again, offset
characteristics from a transamp implemented in the bipolar process were presented to gauge potential
performance characteristics of designs implemented in modern processes.

Chapter 6 then presented an edge detection architecture incorporating floating-gate
buffers and floating-gate edge detection circuits designed to reduce offsets and improve
edge detection capabilities. As detailed, the offset performance of the buffer circuits was
not as good as desired, but the overall detector array performance was superior to previous
designs. This was mainly due to offset reductions in the edge detection circuits. Captured
oscilloscope plots detail array performance in terms of minimum detectable signal
quantities as the filtering is increased and the spatial separation constant is varied. Results
demonstrate a capability to detect edge signals as low as 3 mV.

The silicon process used to fabricate most of these designs was a 2 µm Analog
N-well CMOS process, which prompts the question of how the performance of these types of
architectures may change if implemented in more modern processes. First, modern
processes are typically designed to improve processing speeds. Thus doping levels are
increased, doping depths are decreased, and transistor sizes are reduced in an attempt to
reduce circuit parasitics and increase transistor switching rates. In reducing the transistor
sizes, however, the device dimension errors and variations increase. Parameters such as the
width and the length have increased variations, and the smaller devices also have
greater variations in other processing parameters. All of these characteristics lead to
larger random offset variations from device to device, which would degrade the
performance of on-chip analog processing architectures. Also, as demonstrated by the receptors
implemented in the bipolar process, the shallower doping regions of the
modern process limit the ability of these devices to gather photon-generated minority carriers,
which in turn reduces the low-contrast device performance. Therefore the conclusion is
that the performance of analog processing arrays implemented in modern processes will
likely degrade if no special design considerations are made to compensate for the process
limitations.
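
The link between device size and random offset can be made concrete with a widely used first-order mismatch model. This model is an addition for illustration and does not appear in the thesis: sigma(Delta V_T) is the standard deviation of the threshold-voltage difference between two nominally identical transistors, A_VT is a process-dependent coefficient, W is the gate width, and L_g is the gate length (written L_g to avoid confusion with the filtering constant L).

```latex
\[
\sigma(\Delta V_T) \approx \frac{A_{V_T}}{\sqrt{W L_g}}
\qquad\Longrightarrow\qquad
\frac{\sigma(\Delta V_T)\big|_{W/2,\;L_g/2}}{\sigma(\Delta V_T)\big|_{W,\;L_g}} = 2
\]
```

Under this model, and for a fixed coefficient A_VT, halving both gate dimensions doubles the expected random threshold mismatch. How A_VT itself changes from one process to another is not captured here.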

Keeping the device sizes the same but implementing them in modern processes
presents one with potentially superior dimensional matching characteristics in terms of
length and width variations, but other processing parameter variations would still increase.
The overall offset performance of such a design would depend upon whether the
dimensional variations decrease faster than the doping variations increase,
which would probably vary from process to process. Modern processes also present
another consideration, which is the inclusion of additional metallization layers. The
additional layers could potentially reduce circuit sizes by increasing the routability.

Future research efforts should concentrate on improving on-chip processing
characteristics in two ways. First, the one-dimensional designs presented here should be
implemented as two-dimensional designs. The one-dimensional designs demonstrate the potential
of such processing architectures, but two-dimensional realizations are needed
to examine the processing characteristics of more practical circuits. Second, the
floating-gate circuits demonstrated in Chapter 6 have shown a potential for vastly improved system
performance by incorporating on-chip learning for offset cancellation. This technology
should be pursued further by redesigning the buffers and edge detectors to
incorporate a single on-chip or off-chip feedback circuit which can sequentially train each
of the buffer and comparator circuits to achieve better matching.